[OpenYurt in-depth analysis] Elegant realization of edge gateway caching capabilities


Author | He Linbo (Xinsheng)
Source | Alibaba Cloud Native Official Account

OpenYurt: Extend the capabilities of native K8s to the edge

One year after the launch of Alibaba Cloud Edge Container Service, OpenYurt, a cloud native edge computing solution, was officially open sourced. Unlike other open source containerized edge computing solutions, OpenYurt adheres to the concept of "Extending your native Kubernetes to edge": it makes zero modifications to the Kubernetes system and provides one-click conversion of a native Kubernetes cluster to OpenYurt, so that native K8s clusters gain edge cluster capabilities.

At the same time, as OpenYurt continues to evolve, it will continue to maintain the following development concepts:

  • Non-invasive enhancement of K8s
  • Keep evolving with the mainstream technology of the cloud native community

How OpenYurt solves the problem of edge autonomy

When the Kubernetes system is extended to edge computing scenarios, edge nodes connect to the cloud over the public network. That connection is subject to many uncontrollable factors, which can destabilize the services running at the edge. This is one of the main difficulties in integrating cloud native and edge computing.

To solve this problem, the edge side needs to be autonomous: when the cloud-edge network is disconnected or unstable, edge services must be able to keep running. In OpenYurt, this capability is provided by the yurt-controller-manager and YurtHub components.

1. YurtHub architecture

In the previous article, we introduced the capabilities of the YurtHub component in detail. Its architecture is shown below:

1.png


YurtHub is a "transparent gateway" with a data caching function. If a node or component restarts while the cloud-edge network is disconnected, each component (kubelet, kube-proxy, etc.) obtains the data related to its business containers from YurtHub, which solves the problem of edge autonomy. This also means that we need to implement a lightweight reverse proxy with data caching capabilities.

2. First thoughts

To implement a reverse proxy that caches data, the first idea is to read the data from response.Body and then hand it to both the requesting client and the local cache module. The pseudo code is as follows:

func HandleResponse(rw http.ResponseWriter, resp *http.Response) {
        bodyBytes, _ := ioutil.ReadAll(resp.Body)
        go func() {
                // cache response on local disk
                cacher.Write(bodyBytes)
        }()

        // client reads data from response
        rw.Write(bodyBytes)
}

On closer inspection, this implementation causes the following problems in a Kubernetes system:

  • Question 1: How should streaming data (such as a watch request in K8s) be handled? For a stream, ioutil.ReadAll() does not return until the stream ends, so the data cannot be obtained in a single call. In other words, how can stream data be returned to the client while it is being cached?

  • Question 2: Before the data is cached locally, the incoming byte slice may need to be cleaned first. This means either modifying the byte slice or making a backup copy before processing it, which consumes a large amount of memory. For streaming data, it is also hard to know how large a slice to allocate.

3. Discussion on elegant realization

By abstracting these problems one by one, we can find more elegant implementations.

  • Question 1: How to read and write stream data at the same time

To read and write stream data at the same time (caching while returning), as shown in the figure below, response.Body (an io.Reader) needs to be split into an io.Reader and an io.Writer. Put another way, an io.Reader and an io.Writer are combined into a single io.Reader. This naturally brings to mind the tee command in Linux.

2.png

In Go, the counterpart of the tee command is io.TeeReader. The pseudo code for question 1 is as follows:

func HandleResponse(rw http.ResponseWriter, resp *http.Response) {
        // create TeeReader with response.Body and cacher
        newRespBody := io.TeeReader(resp.Body, cacher)

        // client reads data from response
        io.Copy(rw, newRespBody)
}

By combining response.Body and the cacher with io.TeeReader, the data is written to the cache at the same time as the requesting client reads it from response.Body, which elegantly handles streaming data.
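The TeeReader idea can be tried outside of an HTTP handler. The following is a minimal sketch, not YurtHub code: teeToCache and runTee are hypothetical names, and in-memory buffers stand in for http.ResponseWriter and the cacher.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// teeToCache copies body to client while mirroring every byte the client
// reads into cache. client and cache are hypothetical stand-ins for
// http.ResponseWriter and YurtHub's cacher.
func teeToCache(body io.Reader, client, cache io.Writer) (int64, error) {
	return io.Copy(client, io.TeeReader(body, cache))
}

// runTee is a small driver that reports what each side received.
func runTee(payload string) (clientGot, cacheGot string, err error) {
	var client, cache bytes.Buffer
	if _, err = teeToCache(strings.NewReader(payload), &client, &cache); err != nil {
		return "", "", err
	}
	return client.String(), cache.String(), nil
}

func main() {
	clientGot, cacheGot, err := runTee("watch-event-1\nwatch-event-2\n")
	fmt.Printf("client=%q cache=%q err=%v\n", clientGot, cacheGot, err)
}
```

Because io.TeeReader writes to the cache inside each Read call, no chunk is buffered beyond the copy buffer, which is what makes the approach suitable for long-lived streams.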

  • Question 2: How to clean stream data before caching

As shown in the figure below, to clean stream data before caching, the requesting client and the filter both need to read response.Body (the two-reader problem). That is, response.Body (an io.Reader) needs to be turned into two io.Readers.

3.png

This means question 2 becomes: convert the cache-side io.Writer from question 1 into an io.Reader for the data filter. Linux has a similar mechanism, the pipe. Therefore, the pseudo code for question 2 is as follows:

func HandleResponse(rw http.ResponseWriter, resp *http.Response) {
        pr, pw := io.Pipe()
        // create TeeReader with response.Body and Pipe writer
        newRespBody := io.TeeReader(resp.Body, pw)
        go func() {
                // filter reads data from response
                io.Copy(dataFilter, pr)
        }()

        // client reads data from response
        io.Copy(rw, newRespBody)
        // close the pipe writer so the filter goroutine sees EOF
        pw.Close()
}

With io.TeeReader and io.Pipe, when the requesting client reads data from response.Body, the filter reads the same data from the response at the same time, which elegantly solves the problem of reading streaming data twice.
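The pipe-based variant can be sketched as a runnable program as well. This is an illustration under assumptions, not YurtHub's implementation: teeThroughFilter and runFiltered are hypothetical names, the filter is modeled as a per-chunk function, and bytes.ToUpper on ASCII input stands in for a real cleaning step.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
	"sync"
)

// teeThroughFilter streams body to client and, concurrently, passes the same
// bytes through filter before writing them to cache. filter is a hypothetical
// per-chunk cleaning step.
func teeThroughFilter(body io.Reader, client, cache io.Writer, filter func([]byte) []byte) error {
	pr, pw := io.Pipe()

	var wg sync.WaitGroup
	var filterErr error
	wg.Add(1)
	go func() {
		defer wg.Done()
		buf := make([]byte, 32*1024)
		for {
			n, err := pr.Read(buf)
			if n > 0 {
				if _, werr := cache.Write(filter(buf[:n])); werr != nil {
					filterErr = werr
					// Unblock the writer side so the client copy fails fast.
					pr.CloseWithError(werr)
					return
				}
			}
			if err != nil {
				if err != io.EOF {
					filterErr = err
				}
				return
			}
		}
	}()

	// Every chunk the client reads is also written into the pipe.
	_, err := io.Copy(client, io.TeeReader(body, pw))
	// Closing the writer delivers EOF (or the copy error) to the filter side.
	pw.CloseWithError(err)
	wg.Wait()
	if err != nil {
		return err
	}
	return filterErr
}

// runFiltered is a small driver that reports what each side received.
func runFiltered(payload string) (clientGot, cacheGot string, err error) {
	var client, cache bytes.Buffer
	err = teeThroughFilter(strings.NewReader(payload), &client, &cache, bytes.ToUpper)
	return client.String(), cache.String(), err
}

func main() {
	clientGot, cacheGot, err := runFiltered("pod-a\npod-b\n")
	fmt.Printf("client=%q cache=%q err=%v\n", clientGot, cacheGot, err)
}
```

One design point worth noting: writes to an io.Pipe block until the reader consumes them, so the filter goroutine must keep reading, and the writer must be closed once the client copy finishes. Otherwise one side waits forever.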

YurtHub implementation

Finally, let's look at the related implementation in YurtHub. Since response.Body is an io.ReadCloser, a dualReadCloser is implemented. YurtHub may also need to cache the body of an http.Request, so the isRespBody parameter is added to indicate whether dualReadCloser is responsible for closing the wrapped body.

// https://github.com/openyurtio/openyurt/blob/master/pkg/yurthub/util/util.go#L156
// NewDualReadCloser creates a dualReadCloser object
func NewDualReadCloser(rc io.ReadCloser, isRespBody bool) (io.ReadCloser, io.ReadCloser) {
    pr, pw := io.Pipe()
    dr := &dualReadCloser{
        rc:         rc,
        pw:         pw,
        isRespBody: isRespBody,
    }

    return dr, pr
}

type dualReadCloser struct {
    rc io.ReadCloser
    pw *io.PipeWriter
    // isRespBody shows rc(io.ReadCloser) is a response.Body
    // or not(maybe a request.Body). if it is true(it's a response.Body),
    // we should close the response body in Close func, else not,
    // it(request body) will be closed by http request caller
    isRespBody bool
}

// Read reads data into p and writes it into the pipe
func (dr *dualReadCloser) Read(p []byte) (n int, err error) {
    n, err = dr.rc.Read(p)
    if n > 0 {
        if n, err := dr.pw.Write(p[:n]); err != nil {
            klog.Errorf("dualReader: failed to write %v", err)
            return n, err
        }
    }

    return
}

// Close closes both readers
func (dr *dualReadCloser) Close() error {
    errs := make([]error, 0)
    if dr.isRespBody {
        if err := dr.rc.Close(); err != nil {
            errs = append(errs, err)
        }
    }

    if err := dr.pw.Close(); err != nil {
        errs = append(errs, err)
    }

    if len(errs) != 0 {
        return fmt.Errorf("failed to close dualReader, %v", errs)
    }

    return nil
}

dualReadCloser is used in the modifyResponse() method set on the reverse proxy created by httputil.NewSingleHostReverseProxy. The code is as follows:

// https://github.com/openyurtio/openyurt/blob/master/pkg/yurthub/proxy/remote/remote.go#L85
func (rp *RemoteProxy) modifyResponse(resp *http.Response) error {
            // some pre-checks omitted
            rc, prc := util.NewDualReadCloser(resp.Body, true)
            go func(ctx context.Context, prc io.ReadCloser, stopCh <-chan struct{}) {
                err := rp.cacheMgr.CacheResponse(ctx, prc, stopCh)
                if err != nil && err != io.EOF && err != context.Canceled {
                    klog.Errorf("%s response cache ended with error, %v", util.ReqString(req), err)
                }
            }(ctx, prc, rp.stopCh)

            resp.Body = rc
}
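To see how a response-modifying hook fits into a reverse proxy end to end, here is a simplified, self-contained sketch. It is not the YurtHub implementation: memCache, cachingBody, newCachingProxy and runProxy are hypothetical names, the cache is an in-memory map keyed by request path, and a plain io.TeeReader replaces the pipe and goroutine used by dualReadCloser.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"net/url"
	"sync"
)

// memCache is a hypothetical in-memory cache keyed by request path.
type memCache struct {
	mu   sync.Mutex
	data map[string][]byte
}

func (c *memCache) put(key string, b []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.data[key] = b
}

func (c *memCache) get(key string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	b, ok := c.data[key]
	return b, ok
}

// cachingBody mirrors reads from the upstream body via the embedded
// TeeReader and commits the mirrored bytes when the proxy closes the body.
type cachingBody struct {
	io.Reader
	upstream io.Closer
	commit   func()
}

func (b *cachingBody) Close() error {
	b.commit()
	return b.upstream.Close()
}

// newCachingProxy wires the ModifyResponse hook of a single-host reverse proxy.
func newCachingProxy(target *url.URL, cache *memCache) *httputil.ReverseProxy {
	proxy := httputil.NewSingleHostReverseProxy(target)
	proxy.ModifyResponse = func(resp *http.Response) error {
		if resp.StatusCode != http.StatusOK {
			return nil // only cache successful responses in this sketch
		}
		buf := &bytes.Buffer{}
		key := resp.Request.URL.Path
		resp.Body = &cachingBody{
			Reader:   io.TeeReader(resp.Body, buf),
			upstream: resp.Body,
			commit:   func() { cache.put(key, buf.Bytes()) },
		}
		return nil
	}
	return proxy
}

// runProxy sends one request through the proxy to a test upstream and
// returns what the client received and what ended up in the cache.
func runProxy(body string) (clientGot, cached string) {
	upstream := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		io.WriteString(w, body)
	}))
	defer upstream.Close()

	target, err := url.Parse(upstream.URL)
	if err != nil {
		panic(err)
	}
	cache := &memCache{data: map[string][]byte{}}
	rec := httptest.NewRecorder()
	newCachingProxy(target, cache).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/pods", nil))
	b, _ := cache.get("/pods")
	return rec.Body.String(), string(b)
}

func main() {
	clientGot, cached := runProxy(`{"kind":"PodList"}`)
	fmt.Printf("client=%s cached=%s\n", clientGot, cached)
}
```

This sketch buffers the whole body in memory before committing it, so it does not address question 2 or long-lived watch streams; the pipe-based dualReadCloser above exists precisely to avoid that limitation.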

Summary

Since OpenYurt entered the CNCF sandbox in September 2020, it has continued to develop and iterate rapidly. Thanks to the joint efforts of community members, its open source capabilities currently include:

  • Edge autonomy
  • Edge unit management
  • Cloud-edge collaborative operation and maintenance
  • One-click seamless conversion capability

After thorough discussion with community members, the OpenYurt community has also released its 2021 roadmap. Anyone interested is welcome to contribute.
If you are interested in OpenYurt, you are welcome to join our community exchange group and visit the OpenYurt official website and GitHub project.


Origin blog.51cto.com/13778063/2678454