Distributed media asset management: design ideas and an introduction to distributed file storage and distributed task scheduling

1. Module introduction

At present, the media asset management module manages videos, pictures, documents, and similar assets. Its functions include media asset file query, file upload, video processing, and file deletion.

Media asset query: an institution can query the media asset information it owns.

File upload: including uploading pictures, uploading documents, and uploading videos.

Video processing: after a video is uploaded successfully, the system automatically transcodes it.

File deletion: a teaching institution deletes the media files it uploaded itself.

2. Business process

The business process mainly covers: uploading pictures, uploading videos, and video processing.

2.1 Process of uploading pictures

1. Enter the image upload interface on the front end.
2. Upload the image, requesting the media asset management service.
3. The media asset management service stores the image file in MinIO.
4. The media asset management service records the file information in its database.
5. The front end requests the content management service to save the course information, storing the image address in the content management database.

2.2 Upload video process

1. The front end divides the file into chunks.
2. Before uploading the chunks, the front end asks the media asset service whether the file already exists; if it does, the file is not uploaded again.
3. If the file does not exist, the front end starts uploading.
4. The front end requests the media asset service to upload each chunk.
5. The media asset service uploads the chunks to MinIO.
6. When all chunks are uploaded, the front end asks the media asset service to merge them.
7. Once the media asset service determines the chunked upload is complete, it asks MinIO to merge the files.
8. After the merge, the merged file is verified for integrity; if it is intact the upload is complete, otherwise the file is deleted.
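
For step 2, here is a minimal sketch of the existence check, assuming the service names objects after the file's MD5; the class name, bucket, and naming scheme are illustrative, not the project's actual code:

import io.minio.MinioClient;
import io.minio.StatObjectArgs;
import io.minio.errors.ErrorResponseException;

public class UploadCheckSketch {

    private final MinioClient minioClient = MinioClient.builder()
            .endpoint("http://192.168.101.65:9000")
            .credentials("minioadmin", "minioadmin")
            .build();

    // returns true if the object already exists, in which case the
    // front end skips re-uploading the file entirely
    public boolean fileExists(String bucket, String objectName) {
        try {
            minioClient.statObject(StatObjectArgs.builder()
                    .bucket(bucket)
                    .object(objectName) // e.g. the file's MD5 plus extension
                    .build());
            return true;
        } catch (ErrorResponseException e) {
            return false; // MinIO reports a missing object as an error response
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}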

2.3 Video processing flow

(The video transcoding tool used here is FFmpeg)
1. The task scheduling center broadcasts job shards.
2. The executor receives the broadcast shard parameters and reads pending tasks from the database, fetching unprocessed and failed tasks.
3. The executor marks a task as processing and, according to the task content, downloads the file to be processed from MinIO.
4. The executor starts multiple threads to process tasks.
5. After processing completes, the transcoded video is uploaded back to MinIO.
6. The task result is updated. For a successfully processed video, besides updating the task result, the file's access URL is written to the task table and the file table, and finally a completion record is written to the history table.
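
A minimal sketch of steps 2 and 3 in plain JDBC follows. The table name media_process and the status encoding (1 = pending, 2 = processing, 3 = failed) are illustrative assumptions, not the project's actual schema; the shard parameters come from the scheduling center, as explained in section 6.6:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class TaskClaimSketch {

    // fetch one pending/failed task belonging to this shard and claim it by
    // flipping its status, so two executors never process the same video
    public Long claimOneTask(Connection conn, int shardIndex, int shardTotal) throws SQLException {
        String select = "SELECT id FROM media_process "
                + "WHERE (status = 1 OR status = 3) AND MOD(id, ?) = ? LIMIT 1";
        try (PreparedStatement ps = conn.prepareStatement(select)) {
            ps.setInt(1, shardTotal);
            ps.setInt(2, shardIndex);
            try (ResultSet rs = ps.executeQuery()) {
                if (!rs.next()) {
                    return null; // nothing to do for this shard
                }
                long id = rs.getLong(1);
                // optimistic claim: only succeeds if the row is still unclaimed
                String update = "UPDATE media_process SET status = 2 "
                        + "WHERE id = ? AND status <> 2";
                try (PreparedStatement up = conn.prepareStatement(update)) {
                    up.setLong(1, id);
                    return up.executeUpdate() == 1 ? id : null;
                }
            }
        }
    }
}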

3. Data model

The data tables related to media asset files in this module are as follows:

Media asset file table: stores file information, including pictures, videos, documents, etc.
media_process: the pending-video table, storing videos that are waiting to be transcoded (for example, .avi → .mp4).
media_process_history: the video processing history table, recording videos that have been processed successfully.

4. An overview of the distributed technologies adopted

The distributed technologies used here are MinIO and XXL-JOB.

4.1 Distributed file system (MinIO is used here)

What is a file system?

A file system is the system software responsible for managing and storing files. The operating system accesses files through the interface the file system provides, and users access files on disk through the operating system. Common file systems include FAT16/FAT32, NTFS, HFS, UFS, APFS, XFS, Ext4, etc.
But when massive numbers of files must be stored, how should they be stored?

A distributed file system is the solution for massive numbers of users accessing massive numbers of files: when a single computer cannot store all the files, several computers are organized over a network to store the files together and serve the requests; these computers communicate with each other over the network.
Since processing is distributed, the benefits are:
1. The file-handling capability of one computer is extended to many computers working at the same time.
2. If one computer fails, a replica on another computer still serves the data.
3. The computers can be placed in different regions, so users access the nearest one, improving access speed.

4.2 Distributed task scheduling system (XXL-JOB is used here)

Transcoding one video can be understood as executing one task. When the number of videos is large (massive tasks), how can a batch of tasks be processed efficiently?
1. Multi-threading: make full use of the resources of a single machine.
2. Distributed multi-threading: make full use of multiple computers, each of which also uses multi-threading.
Option 2 is more scalable; it is the distributed task scheduling approach.

What is task scheduling?

Task scheduling, as the name suggests, means scheduling tasks: the system automatically executes tasks at a given point in time, at a given interval, or for a given number of executions, in order to complete a specific piece of business.
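
As a baseline, this is what non-distributed task scheduling looks like inside a single JVM, using only the JDK; distributed schedulers generalize exactly this idea across machines:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class SimpleScheduleDemo {
    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        // execute the task every 30 seconds, starting 10 seconds from now
        scheduler.scheduleAtFixedRate(
                () -> System.out.println("issuing coupons..."),
                10, 30, TimeUnit.SECONDS);
    }
}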

What is distributed task scheduling?

Usually the task scheduler is embedded in the application. For example, a coupon service includes a scheduler that issues coupons on a schedule, and a settlement service includes one that generates reports periodically. Under a distributed architecture, a service is usually deployed as multiple redundant instances; running task scheduling in such a distributed environment is called distributed task scheduling.

Advantages:

Scheduling tasks in a distributed way amounts to building the task scheduler itself as a distributed system, giving it the characteristics of one and improving its scheduling capacity:
1. Parallel task scheduling
Parallel task scheduling relies on multi-threading, but when a huge number of tasks must be scheduled, multi-threading alone hits a bottleneck, because one computer's CPU power is limited. If the scheduler is deployed in a distributed manner, with each node optionally a cluster itself, multiple computers complete the scheduling together: a task can be split into several shards executed in parallel by different instances, raising scheduling throughput.
2. High availability
If one instance goes down, the other instances keep executing tasks.
3. Elastic scaling
Adding instances to the cluster improves task-processing throughput.
4. Task management and monitoring
All scheduled tasks in the system are managed and monitored in one place, so developers and operations staff always know the execution status and can respond quickly to incidents.
5. Avoiding duplicate execution
When the scheduler is deployed as a cluster, the same scheduled task may fire multiple times. In the coupon example above, coupons would be issued more than once, causing real losses to the company, so the same task must be controlled to execute only once across all running instances.

5. MinIO in detail

5.1 MinIO introduction

MinIO is a very lightweight service that is easy to combine with other applications. It is compatible with the Amazon S3 cloud storage interface and is well suited to storing large volumes of unstructured data such as pictures, videos, log files, backup data, and container/VM images.
Its major strengths: it is lightweight, easy to use, and powerful; it supports many platforms; a single object can be up to 5 TB; it is compatible with the Amazon S3 interface; and it provides SDKs in multiple languages, including Java, Python, and Go.
Official website: https://min.io
Chinese: https://www.minio.org.cn/, http://docs.minio.org.cn/docs/

A MinIO cluster adopts a decentralized, shared-nothing architecture in which every node is a peer. MinIO can be accessed through Nginx for load balancing.
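
A minimal sketch of such an Nginx front end, assuming four MinIO nodes all listening on port 9000 (the addresses are illustrative):

upstream minio-cluster {
    server 192.168.101.65:9000;
    server 192.168.101.66:9000;
    server 192.168.101.67:9000;
    server 192.168.101.68:9000;
}

server {
    listen 9000;
    location / {
        # any node can serve any request, since every node is a peer
        proxy_pass http://minio-cluster;
    }
}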

What are the benefits of decentralization?

In the field of big data, the usual design concepts are decentralized and distributed. Distributed MinIO builds a highly available object storage service in which the storage devices can be used regardless of their physical location: multiple hard disks spread across different servers are combined into one object storage service. Since the disks are distributed over different nodes, distributed MinIO avoids single points of failure.
MinIO protects data with erasure coding, a mathematical technique for recovering lost or damaged data. Data blocks are stored redundantly across the disks of the nodes, and all available disks form a set. The figure shows a set of 8 drives: when a file is uploaded, the erasure-coding algorithm splits it into 4 data blocks and generates 4 parity blocks, and these blocks are distributed across the 8 drives.
The benefit of erasure coding is that even if up to half of the drives (N/2) are lost, the data can still be recovered. In the set above, as long as no more than 4 drives are damaged, data recovery is guaranteed and uploads and downloads are unaffected; if more than half the drives are damaged, the data cannot be recovered.

5.2 Using MinIO

After installation and login, the console is very simple; you can explore it on your own.

Bucket: the equivalent of a directory in which files are stored; several buckets can be created.

5.2.1 Maven dependencies

Java 1.8 or higher is required:

<dependency>
    <groupId>io.minio</groupId>
    <artifactId>minio</artifactId>
    <version>8.4.3</version>
</dependency>
<dependency>
    <groupId>com.squareup.okhttp3</groupId>
    <artifactId>okhttp</artifactId>
    <version>4.8.1</version>
</dependency>

5.2.2 Configuration

In the configuration file, fill in your own MinIO connection details. The bucket names configured below must match the buckets created in the MinIO console.

Three parameters are required to connect to the MinIO service:
Endpoint: the URL of the object storage service.
Access Key: like a user ID, it uniquely identifies your account.
Secret Key: the password of your account.

minio:
  endpoint: http://192.168.101.65:9000
  accessKey: minioadmin
  secretKey: minioadmin
  bucket:
    files: mediafiles
    videofiles: video

Outside of quick tests, it is convenient to write a MinIO configuration class, for example:

package com.xuecheng.media.config;

import io.minio.MinioClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * @description MinIO configuration class
 */
@Configuration
public class MinioConfig {

    @Value("${minio.endpoint}")
    private String endpoint;
    @Value("${minio.accessKey}")
    private String accessKey;
    @Value("${minio.secretKey}")
    private String secretKey;

    @Bean
    public MinioClient minioClient() {
        // build the client from the injected configuration values
        return MinioClient.builder()
                .endpoint(endpoint)
                .credentials(accessKey, secretKey)
                .build();
    }

}

5.2.3 Resumable upload (background you need before uploading videos)

What is resumable upload?
Video files are usually large, so the media asset system's upload function must handle large files. The HTTP protocol itself places no limit on upload size, but customers' network quality and hardware vary widely. If a large file is almost finished uploading when the network drops, forcing the customer to start over, the user experience is very poor. So the most basic requirement for large-file upload is the ability to resume after an interruption.
What is resumable transfer?
Quoting Baidu Baike: resumable transfer means artificially dividing a download or upload task (a file or an archive) into several parts, each transferred by its own thread. If a network failure occurs, the transfer can continue from the parts already completed rather than starting from the beginning, saving time and improving the user experience.
The resumable upload process is as follows:

The process:
1. Before uploading, the front end divides the file into chunks.
2. The chunks are uploaded one by one; after an interruption, the upload restarts but already-uploaded chunks are skipped.
3. Once every chunk is uploaded, the files are merged on the server (the test code in 5.2.4 below walks through chunking and merging).

5.2.4 MinIO sample test code

The following only demonstrates basic usage; adapt it for your own use.


import com.j256.simplemagic.ContentInfo;
import com.j256.simplemagic.ContentInfoUtil;
import io.minio.*;
import io.minio.errors.*;
import io.minio.messages.DeleteError;
import io.minio.messages.DeleteObject;
import org.apache.commons.codec.digest.DigestUtils;
import org.apache.commons.compress.utils.IOUtils;
import org.junit.jupiter.api.Test;
import org.springframework.http.MediaType;

import java.io.*;
import java.security.InvalidKeyException;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MinioTest {

    MinioClient minioClient =
            MinioClient.builder()
                    .endpoint("http://192.168.101.65:9000")
                    .credentials("minioadmin", "minioadmin")
                    .build();

    @Test
    public void test_upload() throws IOException, ServerException, InsufficientDataException, ErrorResponseException, NoSuchAlgorithmException, InvalidKeyException, InvalidResponseException, XmlParserException, InternalException {

        // derive the media mimeType from the file extension
        ContentInfo extensionMatch = ContentInfoUtil.findExtensionMatch(".mp4");
        String mimeType = MediaType.APPLICATION_OCTET_STREAM_VALUE; // generic mimeType: byte stream
        if (extensionMatch != null) {
            mimeType = extensionMatch.getMimeType();
        }

        // parameters for the upload
        UploadObjectArgs uploadObjectArgs = UploadObjectArgs.builder()
                .bucket("testbuket") // bucket
                .filename("C:\\Users\\a2262\\Pictures\\blog\\Blue Whale.png") // local file path
                // .object("blog.png") // object name: the storage path relative to the bucket
                .object("test/01/blog.png") // store the file under testbuket/test/01, named blog.png
                // .contentType("image/png") // set the media type explicitly
                .contentType(mimeType)
                .build();

        // upload the file
        minioClient.uploadObject(uploadObjectArgs);
    }

    @Test
    public void test_delete() throws Exception {

        // parameters for the delete
        RemoveObjectArgs removeObjectArgs = RemoveObjectArgs.builder()
                .bucket("testbuket")
                .object("blog.png")
                .build();

        // delete the file
        minioClient.removeObject(removeObjectArgs);
    }

    // query a file: download it from MinIO
    @Test
    public void getFile() throws IOException {

        GetObjectArgs getObjectArgs = GetObjectArgs
                .builder()
                .bucket("testbuket")
                .object("test/01/blog.png")
                .build();
        try (
                // a remote stream: not guaranteed to be stable
                FilterInputStream inputStream = minioClient.getObject(getObjectArgs);
                FileOutputStream outputStream = new FileOutputStream(new File("D:\\MinIo\\blog.png"))
        ) {
            IOUtils.copy(inputStream, outputStream);
        } catch (Exception e) {
            e.printStackTrace();
        }
        // verify integrity: compare the md5 of the source and the downloaded file
        FileInputStream fileInputStream1
                = new FileInputStream(new File("C:\\Users\\a2262\\Pictures\\blog\\Blue Whale.png"));
        String source_md5 = DigestUtils.md5Hex(fileInputStream1);
        FileInputStream fileInputStream
                = new FileInputStream(new File("D:\\MinIo\\blog.png"));
        String local_md5 = DigestUtils.md5Hex(fileInputStream);
        if (source_md5.equals(local_md5)) {
            System.out.println("download succeeded");
        }
    }

    // upload chunk files to MinIO
    @Test
    public void uploadChunk() throws IOException, ServerException, InsufficientDataException, ErrorResponseException, NoSuchAlgorithmException, InvalidKeyException, InvalidResponseException, XmlParserException, InternalException {

        for (int i = 0; i < 2; i++) {
            // parameters for this chunk upload
            UploadObjectArgs uploadObjectArgs = UploadObjectArgs.builder()
                    .bucket("testbuket") // bucket
                    .filename("D:\\MinIo\\upload\\chunk\\" + i) // local chunk file path
                    .object("chunk/" + i)
                    .build();
            minioClient.uploadObject(uploadObjectArgs);
            System.out.println("uploaded chunk " + i + " successfully");
        }
    }


    // call the MinIO API to merge the chunks
    @Test
    public void uploadMerge() throws ServerException, InsufficientDataException, ErrorResponseException, IOException, NoSuchAlgorithmException, InvalidKeyException, InvalidResponseException, XmlParserException, InternalException {

        // describe the chunk files to merge
        List<ComposeSource> sources = Stream.iterate(0, i -> i + 1).limit(2)
                .map(i -> ComposeSource
                        .builder()
                        .bucket("testbuket")
                        .object("chunk/" + i)
                        .build())
                .collect(Collectors.toList());

        // describe the merged target file
        ComposeObjectArgs composeObjectArgs = ComposeObjectArgs.builder()
                .bucket("testbuket")
                .object("merge01.mp4")
                .sources(sources)
                .build();
        // merge the files
        minioClient.composeObject(composeObjectArgs);
    }


    // clean up the chunk files in bulk
    @Test
    public void test_removeObjects() {

        // after the merge succeeds, remove the chunk files
        List<DeleteObject> deleteObjects = Stream.iterate(0, i -> i + 1)
                .limit(2)
                .map(i -> new DeleteObject("chunk/".concat(Integer.toString(i))))
                .collect(Collectors.toList());

        RemoveObjectsArgs removeObjectsArgs = RemoveObjectsArgs.builder()
                .bucket("testbuket") // must match the bucket the chunks were uploaded to
                .objects(deleteObjects)
                .build();
        // removeObjects is lazy: iterate the results to actually perform the deletes
        Iterable<Result<DeleteError>> results = minioClient.removeObjects(removeObjectsArgs);
        results.forEach(r -> {
            try {
                DeleteError deleteError = r.get();
            } catch (Exception e) {
                e.printStackTrace();
            }
        });
    }

    // chunking test: split a local file into chunks
    @Test
    public void testChunk() throws IOException {

        // source file
        File sourceFile = new File("C:\\Users\\a2262\\Videos\\视频\\3db161-170d56d7335.mp4");
        // directory in which to store the chunks
        String chunkFilePath = "D:\\MinIo\\upload\\chunk\\";
        File mk_file = new File(chunkFilePath);
        if (!mk_file.exists()) {
            mk_file.mkdirs();
        }
        // chunk size: MinIO's compose API requires at least 5MB per part, so 5MB minimum
        int chunkSize = 1024 * 1024 * 5;
        // number of chunks, rounded up
        int chunkNum = (int) Math.ceil(sourceFile.length() * 1.0 / chunkSize);
        // read the source with a random-access stream
        RandomAccessFile r = new RandomAccessFile(sourceFile, "r");
        // buffer
        byte[] bytes = new byte[1024];
        for (int i = 0; i < chunkNum; i++) {
            File chunkFile = new File(chunkFilePath + i);
            if (chunkFile.exists()) {
                chunkFile.delete();
            }
            RandomAccessFile rw = new RandomAccessFile(chunkFile, "rw");
            int len = -1;
            // copy until this chunk reaches chunkSize (or the source is exhausted)
            while ((len = r.read(bytes)) != -1) {
                rw.write(bytes, 0, len);
                if (chunkFile.length() >= chunkSize) {
                    break;
                }
            }
            rw.close();
        }
        r.close();
    }

    // merge the chunks locally
    @Test
    public void testMerge() throws IOException {

        // directory holding the chunk files
        File chunkFolder = new File("D:\\MinIo\\upload\\chunk");
        // source file, used to verify the merge result
        File sourceFile = new File("C:\\Users\\a2262\\Videos\\视频\\3db161-170d56d7335.mp4");
        File mk_file = new File("D:\\MinIo\\merge\\chunk");
        if (!mk_file.exists()) {
            mk_file.mkdirs();
        }
        // the merged file
        File mergeFile = new File("D:\\MinIo\\merge\\chunk\\3db161-170d56d7335.mp4");
        // list all chunk files
        File[] files = chunkFolder.listFiles();
        // convert the array to a list
        List<File> filesList = Arrays.asList(files);
        // sort by chunk number so the pieces are concatenated in order
        Collections.sort(filesList, new Comparator<File>() {
            @Override
            public int compare(File o1, File o2) {
                return Integer.parseInt(o1.getName()) - Integer.parseInt(o2.getName());
            }
        });

        // stream for writing the merged file
        RandomAccessFile rw = new RandomAccessFile(mergeFile, "rw");
        // buffer
        byte[] bytes = new byte[1024];

        for (File file : filesList) {
            // stream for reading one chunk
            RandomAccessFile r = new RandomAccessFile(file, "r");
            int len = -1;
            while ((len = r.read(bytes)) != -1) {
                rw.write(bytes, 0, len);
            }
            r.close();
        }
        rw.close();

        // after merging, verify with md5 against the source file
        FileInputStream merge_f = new FileInputStream(mergeFile);
        FileInputStream chunk_f = new FileInputStream(sourceFile);
        String s1 = DigestUtils.md5Hex(merge_f);
        String s2 = DigestUtils.md5Hex(chunk_f);
        if (s1.equals(s2)) {
            System.out.println("merge complete");
        }
    }


}

6. XXL-JOB in detail

6.1 Introduction

XXL-JOB is a lightweight distributed task scheduling platform. Its core design goals are rapid development, ease of learning, light weight, and easy extension. It is open source, runs on the production lines of many companies, and works out of the box.
Official website: https://www.xuxueli.com/xxl-job/
XXL-JOB mainly consists of the scheduling center, executors, and tasks:

Scheduling center: manages scheduling information and issues scheduling requests according to the configured schedule; it contains no business code itself. Its main responsibilities are executor management, task management, monitoring and operations, and log management.
Executor: receives scheduling requests and executes the task logic. Its main responsibilities are registering itself as a service, executing tasks (a received task is placed in a thread-pool task queue), reporting execution results, and log services.
Task: the concrete business logic to execute.

The workflow between the scheduling center and an executor is as follows:

Execution process:
1. The executor automatically registers with the scheduling center at the configured scheduling-center address.
2. When a task's trigger condition is met, the scheduling center dispatches the task.
3. The executor runs the task on its thread pool, puts the execution result into an in-memory queue, and writes the execution log to a log file.
4. The executor consumes the results in the in-memory queue and actively reports them to the scheduling center.
5. When a user views a task log in the scheduling center, the scheduling center requests it from the executor, which reads the log file and returns the details.

6.2 Usage

6.2.1 Download XXL-JOB

GitHub: https://github.com/xuxueli/xxl-job
Gitee: https://gitee.com/xuxueli0323/xxl-job

After downloading and unzipping, open the directory with IDEA:
xxl-job-admin: the scheduling center
xxl-job-core: shared dependencies
xxl-job-executor-samples: executor samples (pick the version that suits you and use it directly)
    xxl-job-executor-sample-springboot: Spring Boot version, managing the executor through Spring Boot; this approach is recommended
    xxl-job-executor-sample-frameless: frameless version
doc: documentation, including the database scripts

6.2.2 Run the database script in doc

Execute the SQL script under the doc directory to create the scheduling-center tables.

6.2.3 Access XXL-JOB

http://your-ip-address:8088/xxl-job-admin/

The admin interface then opens; it is quite simple.

6.3 Executor

The executor is responsible for communicating with the scheduling center and receiving the scheduling requests it initiates.
To configure an executor, click Add, fill in the executor information, and save. The appname is the executor's application name, the same one specified when configuring xxl-job in Nacos.

6.4 Configuration

Maven dependency

<dependency>
    <groupId>com.xuxueli</groupId>
    <artifactId>xxl-job-core</artifactId>
    <version>2.3.1</version>
</dependency>

Configuration file

xxl:
  job:
    admin:
      addresses: http://your-ip-address:8088/xxl-job-admin
    executor:
      appname: media-process-service # executor name
      address:
      ip:
      port: 9999
      logpath: /data/applogs/xxl-job/jobhandler
      logretentiondays: 30
    accessToken: default_token

Note: appname is the executor's application name, and port is the port the executor listens on. If multiple executors are started locally, the ports must differ.

Configuration class

import com.xxl.job.core.executor.impl.XxlJobSpringExecutor;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

/**
 * xxl-job configuration class
 */
@Configuration
public class XxlJobConfig {

    private Logger logger = LoggerFactory.getLogger(XxlJobConfig.class);

    @Value("${xxl.job.admin.addresses}")
    private String adminAddresses;

    @Value("${xxl.job.accessToken}")
    private String accessToken;

    @Value("${xxl.job.executor.appname}")
    private String appname;

    @Value("${xxl.job.executor.address}")
    private String address;

    @Value("${xxl.job.executor.ip}")
    private String ip;

    @Value("${xxl.job.executor.port}")
    private int port;

    @Value("${xxl.job.executor.logpath}")
    private String logPath;

    @Value("${xxl.job.executor.logretentiondays}")
    private int logRetentionDays;


    @Bean
    public XxlJobSpringExecutor xxlJobExecutor() {
        logger.info(">>>>>>>>>>> xxl-job config init.");
        XxlJobSpringExecutor xxlJobSpringExecutor = new XxlJobSpringExecutor();
        xxlJobSpringExecutor.setAdminAddresses(adminAddresses);
        xxlJobSpringExecutor.setAppname(appname);
        xxlJobSpringExecutor.setAddress(address);
        xxlJobSpringExecutor.setIp(ip);
        xxlJobSpringExecutor.setPort(port);
        xxlJobSpringExecutor.setAccessToken(accessToken);
        xxlJobSpringExecutor.setLogPath(logPath);
        xxlJobSpringExecutor.setLogRetentionDays(logRetentionDays);

        return xxlJobSpringExecutor;
    }

    /**
     * For multiple NICs, deployment inside containers, etc., the registration IP
     * can be customized with the "InetUtils" component from "spring-cloud-commons":
     *
     *      1. Add the dependency:
     *          <dependency>
     *             <groupId>org.springframework.cloud</groupId>
     *             <artifactId>spring-cloud-commons</artifactId>
     *             <version>${version}</version>
     *         </dependency>
     *
     *      2. In the configuration file, or as a container startup variable:
     *          spring.cloud.inetutils.preferred-networks: 'xxx.xxx.xxx.'
     *
     *      3. Obtain the IP:
     *          String ip_ = inetUtils.findFirstNonLoopbackHostInfo().getIpAddress();
     */


}

After restarting the project, check the executor's online machine addresses in the XXL-JOB admin interface: one registered instance should now appear. Alternatively, watch the startup log; a successful registration message means the executor has registered with the scheduling center.

6.5 Executing tasks

Refer to the sample code in the executor sample project, for example:
import com.xxl.job.core.context.XxlJobHelper;
import com.xxl.job.core.handler.annotation.XxlJob;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Component;

import java.util.concurrent.TimeUnit;

/**
 * @description test executor
 * @author Mr.M
 * @date 2022/9/13 20:32
 * @version 1.0
 */
@Component
@Slf4j
public class SampleJob {
    /**
     * 1. Simple task example (Bean mode)
     */
    @XxlJob("testJob")
    public void testJob() throws Exception {
        log.info("task execution started.....");
    }

}

Next, add a task in the scheduling center: open task management, click Add, and fill in the task information.
Scheduling type:
Fixed rate: triggers at a fixed time interval.
Cron: implements richer scheduling strategies through a Cron expression.
A Cron expression is a string that defines the schedule, with the format:
{seconds} {minutes} {hours} {day of month} {month} {day of week} {year (optional)}
xxl-job provides a graphical interface for configuring it:

Some examples:
30 10 1 * * ?   — triggers every day at 1:10:30
0/30 * * * * ?  — triggers every 30 seconds
0 0/10 * * * ?  — triggers every 10 minutes

Run mode:
BEAN or GLUE. Bean mode, the more common choice, keeps the task code in your own project; GLUE mode keeps the task code in the scheduling center.

JobHandler:
the task method's name — fill in the name given in the @XxlJob annotation above the task method.

Routing strategy:
when the executor is deployed as a cluster, which executor does the scheduling center dispatch the task to? Selecting FIRST here means the task is always delivered to the first executor. The other routing options, and the remaining advanced configuration items, are explained in detail in the sharding broadcast section below.

Once the task is added successfully, start it.
To stop the task, operate on its schedule entry; you can also clear its logs.

6.6 The sharding broadcast routing strategy

6.6.1 What are the routing strategies?

Advanced configuration:
- Routing strategy: when the executor is deployed as a cluster, rich routing strategies are provided, including:
FIRST: always select the first machine;
LAST: always select the last machine;
ROUND: round-robin across machines;
RANDOM: randomly select an online machine;
CONSISTENT_HASH: each task selects a machine via a consistent-hash algorithm, so all tasks are spread evenly over different machines;
LEAST_FREQUENTLY_USED: the least frequently used machine is chosen first;
LEAST_RECENTLY_USED: the machine unused for the longest time is chosen first;
FAILOVER: heartbeat checks run in order, and the first machine with a successful heartbeat is selected as the target executor for the dispatch;
BUSYOVER: idle checks run in order, and the first machine that reports idle is selected as the target executor for the dispatch;
SHARDING_BROADCAST: the trigger is broadcast to every machine in the cluster, and the system automatically passes sharding parameters; sharded tasks can be developed based on those parameters;

- Child task: every task has a unique task ID (available from the task list). When a task finishes successfully, it triggers one active scheduling of the task with the configured child-task ID; this chains one task's completion into another task's execution.
- Schedule-misfire strategy:
    - Ignore: after a misfire, skip the missed run and recalculate the next trigger time from now;
    - Fire once now: after a misfire, execute once immediately, then recalculate the next trigger time from now;
- Blocking strategy: what to do when scheduling is too dense for the executor to keep up:
    Serial execution (default): scheduling requests enter a FIFO queue on the executor and run serially;
    Discard later scheduling: if the executor already has a running instance of this task, the new request is discarded and marked failed;
    Cover earlier scheduling: if the executor already has a running instance of this task, the running task is terminated, the queue is cleared, and the new request runs;
- Task timeout: a custom timeout can be set; a task that exceeds it is actively interrupted;
- Failure retry count: a custom retry count can be set; a failed task is actively retried that many times.

6.6.2 A closer look at the sharding broadcast strategy

The key strategy is sharding broadcast. Sharding means the scheduling center shards along the executor dimension, numbering the executors in the cluster 0, 1, 2, 3, ...; broadcast means every trigger is dispatched to all executors in the cluster, with the sharding parameters carried in the request, so each executor can use them for sharded business processing.

Each executor therefore receives both the scheduling request and its sharding parameters.
xxl-job supports dynamically scaling the executor cluster, which dynamically changes the shard count: when the workload grows, more executors can be added to the cluster, and the scheduling center adjusts the number of shards accordingly.

What scenarios is job sharding suitable for?
- Sharded task scenario: a cluster of 10 executors processes 100,000 records; each machine only needs to process 10,000, cutting the elapsed time by a factor of 10.
- Broadcast task scenario: run a shell script on every node, have all cluster nodes update a cache at the same time, etc.
Sharding broadcast therefore makes full use of every executor, lets the shard parameters control whether a given task runs, and gives flexible control over distributed processing across the executor cluster.

6.6.3 Sharding broadcast example

Sample code:
 /**
  * 2. Sharding broadcast task
  */
 @XxlJob("shardingJobHandler")
 public void shardingJobHandler() throws Exception {

     // sharding parameters
     int shardIndex = XxlJobHelper.getShardIndex();
     int shardTotal = XxlJobHelper.getShardTotal();

     log.info("sharding parameters: current shard index = {}, total shards = {}", shardIndex, shardTotal);
     log.info("executing shard " + shardIndex + " of the task");
 }

Add the task in the scheduling center. Once it is added successfully, start it and observe the log.

6.7 How do multiple executors avoid fetching the same task?

Each executor receives the broadcast with two parameters: the total number of shards and its own shard index. When fetching tasks from the data table, take the task id modulo the shard total; if the result equals this executor's shard index, the executor takes that task.
For the two executor instances above, the shard total is 2 and the shard indexes are 0 and 1. Starting from task 1:
1 % 2 = 1 → the executor with index 1 takes it
2 % 2 = 0 → the executor with index 0 takes it
3 % 2 = 1 → the executor with index 1 takes it
and so on.
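
Putting this together with the shard parameters from 6.6.3, a sketch of the filter looks like the following; fetchPendingTaskIds() and processVideo() are hypothetical stand-ins for the real DAO query and transcoding logic:

import com.xxl.job.core.context.XxlJobHelper;
import com.xxl.job.core.handler.annotation.XxlJob;
import org.springframework.stereotype.Component;

import java.util.Arrays;
import java.util.List;

@Component
public class ShardFilterSketch {

    @XxlJob("shardingVideoHandler")
    public void shardingVideoHandler() {
        int shardIndex = XxlJobHelper.getShardIndex(); // this executor's shard index
        int shardTotal = XxlJobHelper.getShardTotal(); // total number of shards

        for (Long taskId : fetchPendingTaskIds()) {
            // each task id maps to exactly one shard index, so no two
            // executors ever pick up the same task
            if (taskId % shardTotal == shardIndex) {
                processVideo(taskId);
            }
        }
    }

    private List<Long> fetchPendingTaskIds() {
        return Arrays.asList(1L, 2L, 3L); // placeholder data
    }

    private void processVideo(Long taskId) {
        // ... transcode the video for this task ...
    }
}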

6.8 How to ensure tasks are not executed repeatedly

Job sharding ensures that executors do not fetch duplicate tasks. But if an executor has not yet finished processing a video when the scheduling center triggers again, how do we avoid processing the same video twice?

First, configure the schedule-misfire strategy. The documentation describes it as follows:
- Schedule-misfire strategy: the compensation strategy when the scheduling center misses a trigger time, including Ignore, Fire once now, etc.
    - Ignore: after a misfire, skip the missed run and recalculate the next trigger time from now;
    - Fire once now: after a misfire, execute once immediately, then recalculate the next trigger time from now;
- Blocking strategy: what to do when scheduling is too dense for the executor to keep up.
Here we choose Ignore: firing once immediately could process the same task repeatedly.

Are these configurations alone enough to guarantee a task never runs twice?

Not entirely; we still need to make task processing idempotent.

What is task idempotency?

Task idempotency means that no matter how many times the same operation is applied to the data, the result is the same. The goal in this project: no matter how many times the same video is scheduled, it is transcoded successfully only once.

What is idempotency?

It describes the property that making one request or many requests for a resource has the same effect on the resource itself.
Idempotency solves problems caused by repeated submissions, such as malicious replays and duplicate payments.

Commonly used idempotency solutions:

1) Database constraints, such as a unique index or primary key.
2) Optimistic locking, often at the database level: an update is applied only if the optimistic-lock state still matches.
3) Unique serial number: the operation carries a unique serial number, and it executes only if that serial number has not been processed before.
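
A sketch of option 1 applied to this project: assuming media_process_history has a unique index on a file_id column (the column names are illustrative), a duplicate insert fails and the executor treats the video as already processed:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.SQLIntegrityConstraintViolationException;

public class IdempotencySketch {

    public boolean recordOnce(Connection conn, String fileId) throws SQLException {
        String sql = "INSERT INTO media_process_history (file_id, finish_date) VALUES (?, NOW())";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.executeUpdate();
            return true;  // first time: this executor owns the transcode result
        } catch (SQLIntegrityConstraintViolationException e) {
            return false; // unique index hit: already processed, skip
        }
    }
}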

In essence, these techniques all act as a form of distributed lock.

What is a distributed lock?

Unlike synchronized, which only lets threads inside one JVM compete for a lock, a distributed lock must be contested by multiple JVMs. The lock is provided by a separate service offering lock and unlock operations; it belongs to no single JVM but is deployed independently and shared by all instances. Such a lock is called a distributed lock.
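As an illustration only — not necessarily what this project does — a distributed lock can be sketched with Redis via Spring Data Redis: SET with NX and a TTL lets exactly one JVM acquire the key:

import org.springframework.data.redis.core.StringRedisTemplate;

import java.util.UUID;
import java.util.concurrent.TimeUnit;

public class RedisLockSketch {

    private final StringRedisTemplate redis;

    public RedisLockSketch(StringRedisTemplate redis) {
        this.redis = redis;
    }

    public void transcodeWithLock(String videoId) {
        String lockKey = "lock:transcode:" + videoId;
        String token = UUID.randomUUID().toString();
        // acquire: succeeds for exactly one instance; the TTL guards against crashes
        Boolean locked = redis.opsForValue().setIfAbsent(lockKey, token, 30, TimeUnit.MINUTES);
        if (Boolean.TRUE.equals(locked)) {
            try {
                // ... transcode the video ...
            } finally {
                // release only our own lock (a compare-and-delete via Lua is safer)
                if (token.equals(redis.opsForValue().get(lockKey))) {
                    redis.delete(lockKey);
                }
            }
        }
    }
}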

7. Supplement: using FFmpeg

Download: FFmpeg https://www.ffmpeg.org/download.html#build-windows

Find ffmpeg.exe (for example, in the course's common-tools directory) and add it to the PATH environment variable.
Verify the installation by running ffmpeg -version in cmd.
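
For reference, a typical transcoding command (the same one embedded in the utility class below) converts an AVI into a 1280x720 H.264 MP4:

ffmpeg.exe -i lucene.avi -c:v libx264 -s 1280x720 -pix_fmt yuv420p -b:a 63k -b:v 753k -r 18 .\lucene.mp4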

7.1 Sample code

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class Mp4VideoUtil extends VideoUtil {

    String ffmpeg_path = "D:\\Program Files\\ffmpeg-20180227-fa0c9d6-win64-static\\bin\\ffmpeg.exe"; // where ffmpeg is installed
    String video_path = "D:\\BaiduNetdiskDownload\\test1.avi";
    String mp4_name = "test1.mp4";
    String mp4folder_path = "D:/BaiduNetdiskDownload/Movies/test1/";

    public Mp4VideoUtil(String ffmpeg_path, String video_path, String mp4_name, String mp4folder_path) {
        super(ffmpeg_path);
        this.ffmpeg_path = ffmpeg_path;
        this.video_path = video_path;
        this.mp4_name = mp4_name;
        this.mp4folder_path = mp4folder_path;
    }
    // delete a previously generated mp4
    private void clear_mp4(String mp4_path) {
        File mp4File = new File(mp4_path);
        if (mp4File.exists() && mp4File.isFile()) {
            mp4File.delete();
        }
    }
    /**
     * Transcode the video and generate the mp4 file.
     * @return "success" on success, otherwise the console log output
     */
    public String generateMp4() {
        // remove any previously generated mp4
//        clear_mp4(mp4folder_path + mp4_name);
        clear_mp4(mp4folder_path);
        /*
        The command being built:
        ffmpeg.exe -i lucene.avi -c:v libx264 -s 1280x720 -pix_fmt yuv420p -b:a 63k -b:v 753k -r 18 .\lucene.mp4
         */
        List<String> commend = new ArrayList<String>();
        //commend.add("D:\\Program Files\\ffmpeg-20180227-fa0c9d6-win64-static\\bin\\ffmpeg.exe");
        commend.add(ffmpeg_path);
        commend.add("-i");
//        commend.add("D:\\BaiduNetdiskDownload\\test1.avi");
        commend.add(video_path);
        commend.add("-c:v");
        commend.add("libx264");
        commend.add("-y"); // overwrite the output file
        commend.add("-s");
        commend.add("1280x720");
        commend.add("-pix_fmt");
        commend.add("yuv420p");
        commend.add("-b:a");
        commend.add("63k");
        commend.add("-b:v");
        commend.add("753k");
        commend.add("-r");
        commend.add("18");
//        commend.add(mp4folder_path + mp4_name);
        commend.add(mp4folder_path);
        String outstring = null;
        try {
            ProcessBuilder builder = new ProcessBuilder();
            builder.command(commend);
            // merge stderr into stdout so all ffmpeg output is read from one stream
            builder.redirectErrorStream(true);
            Process p = builder.start();
            outstring = waitFor(p);
        } catch (Exception ex) {
            ex.printStackTrace();
        }
//        Boolean check_video_time = this.check_video_time(video_path, mp4folder_path + mp4_name);
        Boolean check_video_time = this.check_video_time(video_path, mp4folder_path);
        if (!check_video_time) {
            return outstring;
        } else {
            return "success";
        }
    }

    public static void main(String[] args) throws IOException {
        // path to ffmpeg
        String ffmpeg_path = "D:\\Baidu Netdisk\\item1\\day1\\常用软件\\ffmpeg\\ffmpeg.exe"; // where ffmpeg is installed
        // path to the source avi video
        String video_path = "D:\\develop\\bigfile_test\\nacos01.avi";
        // name of the converted mp4 file
        String mp4_name = "nacos01.mp4";
        // path of the converted mp4 file
        String mp4_path = "D:\\MinIo\\nacos01.mp4";
        // create the utility object
        Mp4VideoUtil videoUtil = new Mp4VideoUtil(ffmpeg_path, video_path, mp4_name, mp4_path);
        // start the conversion; returns "success" on success
        String s = videoUtil.generateMp4();
        System.out.println(s);
    }
}
