OpenSSL practice based on wasm

The previous article introduced the concepts and basic usage of WebAssembly and built a general understanding of it through two code examples. This article puts WebAssembly into practice with an encryption tool: taking OpenSSL's digest algorithms md5 and sha1 as examples, we compile OpenSSL to WebAssembly on a Mac.

Environment

  • Emscripten version 2.0.3
  • OpenSSL version 1.1.1d
  • Browser version 85.0.4183.121 (official version) (64 bit)

Overview

  • Compile OpenSSL to WebAssembly on Mac
  • Problems encountered
  • Summary

1. Compile OpenSSL to WebAssembly on Mac

The overall process of compiling OpenSSL to WebAssembly is: md5.c file -> Emscripten compilation -> .wasm file -> combined with the WebAssembly JS API -> run in the browser.

1. md5.c file
//md5.c
#include <emscripten.h>
#include <openssl/md5.h>
#include <openssl/sha.h>
#include <string.h>
#include <stdio.h>

// Writes the 32-character hex MD5 digest plus a terminating '\0' into result,
// so result must point to at least 33 bytes.
EMSCRIPTEN_KEEPALIVE
void md5(char *str, char *result, int len) {
    MD5_CTX md5_ctx;
    unsigned char md5sum[MD5_DIGEST_LENGTH];
    MD5_Init(&md5_ctx);
    MD5_Update(&md5_ctx, str, len);
    MD5_Final(md5sum, &md5_ctx);
    for (int i = 0; i < MD5_DIGEST_LENGTH; i++) {
        // Each call writes two hex characters and a '\0' that the next call overwrites.
        sprintf(&result[i * 2], "%02x", md5sum[i]);
    }
}

// Writes the 40-character hex SHA-1 digest plus a terminating '\0' into result,
// so result must point to at least 41 bytes.
EMSCRIPTEN_KEEPALIVE
void sha1(char *str, char result[], int len) {
    unsigned char digest[SHA_DIGEST_LENGTH];
    SHA_CTX ctx;
    SHA1_Init(&ctx);
    SHA1_Update(&ctx, str, len);
    SHA1_Final(digest, &ctx);
    for (int i = 0; i < SHA_DIGEST_LENGTH; i++) {
        sprintf(&result[i * 2], "%02x", (unsigned int)digest[i]);
    }
}

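The md5.c file contains two functions, md5 and sha1, which will be compiled to wasm later.

Tips: 
1. By default, the code generated by Emscripten only calls the main() function, and other functions are treated as dead code. Adding EMSCRIPTEN_KEEPALIVE before a function name prevents this. You need to include emscripten.h to use EMSCRIPTEN_KEEPALIVE.
2. The implementation calls functions provided by openssl; a thin wrapper around them is enough.

2. Emscripten compilation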
Download openssl and generate Makefile

The openssl version I use is 1.1.1d, available at: https://github.com/openssl/openssl/releases/tag/OpenSSL_1_1_1d
After unzipping, enter the openssl-OpenSSL_1_1_1d folder and run Configure to generate the Makefile.

emcmake ./Configure  darwin64-x86_64-cc -no-asm --api=1.1.0

Modify the generated Makefile; without these changes, compilation errors are likely.

  • Change CROSS_COMPILE=/usr/local/Cellar/emscripten/1.38.44/libexec/em to CROSS_COMPILE= (the path on the left depends on where Emscripten is installed)
  • Change CNF_CFLAGS=-arch x86_64 to CNF_CFLAGS=

Compile openssl
emmake make -j 12 build_generated libssl.a libcrypto.a
mkdir -p ~/resource/openssl/libs
cp -R include ~/resource/openssl/include
cp libcrypto.a libssl.a ~/resource/openssl/libs

The openssl directory is created so that the headers and static libraries can be referenced from a fixed location when compiling md5.c. After a successful build, the two files libssl.a and libcrypto.a appear in the folder.

Compile wasm
emcc md5.c -I ~/resource/openssl/include -L ~/resource/openssl/libs -lcrypto -s EXTRA_EXPORTED_RUNTIME_METHODS='["cwrap", "ccall"]' -o md5.js

After successful compilation, two files md5.js and md5.wasm will be generated.

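Tips: 
Starting with Emscripten v1.38, the ccall/cwrap helper functions are no longer exported by default; they must be exported explicitly at compile time with the -s "EXTRA_EXPORTED_RUNTIME_METHODS=['ccall', 'cwrap']" option.

3. Call the wasm file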

Use the WebAssembly JS API to call the wasm module. The md5 and sha1 code both live in md5.html and are used in the same way, so only the md5-related code is shown here. Code address: https://github.com/likai1130/study/blob/master/wasm/openssl/demo/md5.html

<div>
    <div>
        <input type="file" id="md5files" style="display: none" onchange="md5fileImport();">Compute md5
        <input type="button" id="md5fileImport" value="Import">
    </div>
</div>

<script src="jquery-3.5.1.min.js"></script>
<script src="md5.js"></script>
<script type='text/javascript'>
    // md5.js defines the global Module object; it must not be reassigned here.
    // Allocate len bytes on the wasm heap and return a Uint8Array view of them.
    const mallocByteBuffer = len => {
        const ptr = Module._malloc(len)
        const heapBytes = new Uint8Array(Module.HEAPU8.buffer, ptr, len)
        return heapBytes
    }
    // Clicking the Import button triggers a click on the hidden file input, which starts the file read
    $("#md5fileImport").click(function() {
        $("#md5files").click();
    })
    function md5fileImport() {
        // Get the File object for the selected file
        var selectedFile = document.getElementById('md5files').files[0];
        var name = selectedFile.name; // name of the selected file
        var size = selectedFile.size; // size of the selected file
        console.log("File name: " + name + ", size: " + size);
        var reader = new FileReader(); // performs the actual read
        reader.readAsArrayBuffer(selectedFile)
        reader.onload = function() {
            // Called when the read completes; the file content is in reader.result
            console.log(reader.result);
            // md5(str, result, len) takes three numeric arguments
            const md5 = Module.cwrap('md5', null, ['number', 'number', 'number'])
            const inBuffer = mallocByteBuffer(reader.result.byteLength)
            var ctx = new Uint8Array(reader.result)
            inBuffer.set(ctx)
            // 32 hex characters plus the terminating '\0' written by the C code
            const outBuffer = mallocByteBuffer(33)
            md5(inBuffer.byteOffset, outBuffer.byteOffset, inBuffer.byteLength)
            console.log("md5 = ", Array.from(outBuffer.subarray(0, 32)).map(v => String.fromCharCode(v)).join(''))
            // _free expects the pointer (byte offset), not the typed-array view
            Module._free(inBuffer.byteOffset);
            Module._free(outBuffer.byteOffset);
        }
    }
</script>

4. Run in the browser

The test file a.out is a binary file:
md5: 0d3c57ec65e81c7ff6da72472c68d95b
sha1: 9ef00799a4472c71f2177fd7254faaaadedb0807

The two screenshots from the original post (not reproduced here) show the md5 and sha1 values computed by the program in the browser and the values computed by openssl on the system. The values match, which shows that compiling openssl to WebAssembly works.
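
As an additional cross-check outside the browser, the same digests can be computed with a different implementation. The following is a minimal sketch that assumes Node.js and its built-in crypto module, neither of which is part of the original setup; the helper name hexDigest is hypothetical.

```javascript
// Assumption: Node.js is available; this is not part of the original article's toolchain.
const nodeCrypto = require('crypto');

// Hypothetical helper: hex digest of a Buffer, typed array, or string.
function hexDigest(algorithm, data) {
  return nodeCrypto.createHash(algorithm).update(data).digest('hex');
}

// Well-known test vectors for the ASCII string "abc".
console.log(hexDigest('md5', 'abc'));  // 900150983cd24fb0d6963f7d28e17f72
console.log(hexDigest('sha1', 'abc')); // a9993e364706816aba3e25717850c26c9cd0d89d
```

To check a file such as a.out, pass the result of require('fs').readFileSync('a.out') as the data argument and compare the output with the value printed in the browser console.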

2. Problems encountered

The call chain is as follows:

md5.js (glue code) <-----> md5.c <-----> openssl API

Data communication problem

Throughout this practice, the most troublesome problem was data communication. Passing complex data structures between C/C++ and JS is cumbersome and has to be done by operating on memory directly.

  • JavaScript and C/C++ exchange data

    ;; The md5 function as it appears in the disassembled md5.wasm
    func $md5 (;3;) (export "md5") (param $var0 i32) (param $var1 i32) (param $var2 i32)

    wasm can currently only import and export C-style function APIs, and parameters are limited to four data types (i32, i64, f32, f64), all of which are numbers. They can be thought of as raw binary values, so complex types and data structures cannot be passed directly. In the browser, higher-level APIs therefore have to be wrapped in JS, and a mechanism is needed to convert complex data structures across the language boundary.
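
The numbers-only boundary can be observed without Emscripten. The sketch below hand-assembles a tiny wasm module that exports add(i32, i32) -> i32; the module bytes are written for this illustration and are not taken from md5.wasm. It assumes an environment with the standard WebAssembly JS API (a browser or Node.js).

```javascript
// A hand-assembled wasm module exporting add(i32, i32) -> i32 (illustration only).
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // magic "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type section: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                               // function section: one function of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export section: "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section: one body, no locals
  0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b,                   // local.get 0, local.get 1, i32.add, end
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(bytes));
const add = instance.exports.add;

console.log(add(2, 3));       // 5
// A string argument is not passed as a string: it is coerced to an i32 (NaN becomes 0).
console.log(add('hello', 1)); // 1
```

This is why md5 receives two pointers and a length rather than a buffer or a string: the caller has to place the bytes in wasm memory first and pass their address.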

    • Module.buffer

      Whether the compilation target is asm.js or wasm, the memory space seen by C/C++ code corresponds to an ArrayBuffer object provided by Emscripten, Module.buffer, and C/C++ memory addresses map one-to-one to indexes into Module.buffer.

      function md5fileImport() {
         var selectedFile = document.getElementById('md5files').files[0];
         var name = selectedFile.name; // name of the selected file
         var size = selectedFile.size; // size of the selected file
         console.log("File name: " + name + ", size: " + size);
         var reader = new FileReader(); // this is the core: it performs the read

         reader.readAsArrayBuffer(selectedFile)
         .....
      }

      In the code, reader.readAsArrayBuffer() reads the file and produces an ArrayBuffer. That alone is not enough to call the C function: an ArrayBuffer cannot be read or written directly, so you create a typed array, such as Int8Array or Uint32Array, and use it as a view over the binary data to read and write it.

      Tips:
          All data that C/C++ code can access directly by address lives in memory (including the runtime heap and the runtime stack), and that memory corresponds to the Module.buffer object. The data C/C++ code can access directly is therefore confined to Module.buffer.

      WebAssembly memory is also an ArrayBuffer. The Module object wrapped by Emscripten provides several typed-array views of it, such as Module.HEAP8 and Module.HEAPU8. (The figure listing these views in the original post is not reproduced here.)
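
How several views share one block of memory can be sketched without an Emscripten build. The example below uses a plain WebAssembly.Memory as a stand-in for the module's memory; the local names HEAPU8 and HEAP32 only mirror Emscripten's naming and are created by hand here. It assumes a little-endian host, which is what wasm memory itself uses.

```javascript
// Stand-in for the wasm linear memory: one 64 KiB page.
const memory = new WebAssembly.Memory({ initial: 1 });

// Two views over the same bytes, analogous to Module.HEAPU8 and Module.HEAP32.
const HEAPU8 = new Uint8Array(memory.buffer);
const HEAP32 = new Int32Array(memory.buffer);

// Write a 32-bit integer at byte address 8 (index 8 / 4 in the 32-bit view).
const ptr = 8;
HEAP32[ptr >> 2] = 0x01020304;

// Read the same memory byte by byte through the other view.
const bytesAtPtr = Array.from(HEAPU8.subarray(ptr, ptr + 4));
console.log(bytesAtPtr); // [4, 3, 2, 1] on a little-endian host
```

A C pointer is simply a byte offset into this buffer, so the view to index with depends on the element type: a char * maps to HEAPU8[ptr], an int * to HEAP32[ptr >> 2].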

  • Access C/C++ memory in JavaScript

Calculating md5/sha1 requires JavaScript to pass a large amount of data into the C/C++ environment, and the C/C++ side cannot know the size of the data block in advance. In this case, you can allocate memory from JavaScript, copy the data into it, and then pass the pointer to the C function for processing.

Tips:
This works because Emscripten exports C's malloc()/free().

I declared the memory allocation helper as a shared function.

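        // md5.js defines the global Module object; it must not be reassigned here.
        // Allocate len bytes on the wasm heap and return a Uint8Array view of them.
        const mallocByteBuffer = len => {
            const ptr = Module._malloc(len)
            const heapBytes = new Uint8Array(Module.HEAPU8.buffer, ptr, len)
            return heapBytes
        }

        function md5fileImport() {
            // Get the File object for the selected file
            var selectedFile = document.getElementById('md5files').files[0];
            ......
            var reader = new FileReader(); // this is the core: it performs the read
            reader.readAsArrayBuffer(selectedFile)
            reader.onload = function() {
                // Called when the read completes; the file content is in reader.result
                // md5(str, result, len) takes three numeric arguments
                const md5 = Module.cwrap('md5', null, ['number', 'number', 'number'])
                const inBuffer = mallocByteBuffer(reader.result.byteLength)
                var ctx = new Uint8Array(reader.result)
                inBuffer.set(ctx)
                // 32 hex characters plus the terminating '\0' written by the C code
                const outBuffer = mallocByteBuffer(33)
                md5(inBuffer.byteOffset, outBuffer.byteOffset, inBuffer.byteLength)

                console.log("md5 = ", Array.from(outBuffer.subarray(0, 32)).map(v => String.fromCharCode(v)).join(''))
                // _free expects the pointer (byte offset), not the typed-array view
                Module._free(inBuffer.byteOffset);
                Module._free(outBuffer.byteOffset);
            }
        }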
Tips: 
C/C++ memory has no GC mechanism. Memory allocated from JavaScript with malloc() must be released with free() once it is no longer needed.
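
One way to make sure every allocation is released, even when the call in between throws, is to wrap the allocate/copy/free sequence in a helper. This is a minimal sketch, not the article's code: withHeapCopy and createFakeModule are hypothetical names, and the fake module only imitates the _malloc, _free and HEAPU8 members of an Emscripten Module so that the example runs without a build.

```javascript
// Hypothetical helper: copy `bytes` into the wasm heap, run fn(ptr, len),
// and always release the allocation, even if fn throws.
function withHeapCopy(Module, bytes, fn) {
  const ptr = Module._malloc(bytes.length);
  try {
    // Read Module.HEAPU8 after _malloc: if memory growth is enabled,
    // an allocation may replace the underlying buffer.
    Module.HEAPU8.set(bytes, ptr);
    return fn(ptr, bytes.length);
  } finally {
    Module._free(ptr); // _free takes the pointer, not a typed-array view
  }
}

// Tiny stand-in for an Emscripten Module (bump allocator that tracks live pointers).
function createFakeModule(size) {
  let next = 8;
  const live = new Set();
  return {
    HEAPU8: new Uint8Array(size),
    live,
    _malloc(len) { const p = next; next += len; live.add(p); return p; },
    _free(p) { live.delete(p); },
  };
}

const fake = createFakeModule(1024);
const sum = withHeapCopy(fake, Uint8Array.of(1, 2, 3), (ptr, len) => {
  let total = 0;
  for (let i = 0; i < len; i++) total += fake.HEAPU8[ptr + i];
  return total;
});
console.log(sum, fake.live.size); // 6 0
```

With a real build, the callback would be the place to call the wrapped md5 function with ptr and len, and the output buffer could be managed the same way.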

In addition, Emscripten provides a series of helper functions such as AsciiToString()/stringToAscii()/UTF8ArrayToString()/stringToUTF8Array() for converting strings between formats and storage objects. For details, refer to the generated glue code.
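
To give a feel for what such a helper does, here is a minimal sketch of reading a NUL-terminated UTF-8 string out of a heap view. It is an illustration written with the standard TextDecoder API, not Emscripten's implementation, and readCString is a hypothetical name.

```javascript
// Hypothetical helper: decode a NUL-terminated UTF-8 string starting at ptr.
function readCString(heap, ptr, maxBytes = heap.length - ptr) {
  let end = ptr;
  const limit = ptr + maxBytes;
  // Scan forward until the terminating '\0' or the byte limit.
  while (end < limit && heap[end] !== 0) end++;
  return new TextDecoder('utf-8').decode(heap.subarray(ptr, end));
}

// Stand-in heap holding the C string "0d3c" at address 16 (remaining bytes are 0).
const heap = new Uint8Array(64);
heap.set(new TextEncoder().encode('0d3c'), 16);
console.log(readCString(heap, 16)); // 0d3c
```

Applied to the md5 example, a helper of this kind could replace the manual Array.from(...).map(String.fromCharCode) conversion of the output buffer.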

3. Summary

The complete call relationship of openssl based on wasm:

[Figure: call relationship diagram, not reproduced here]

The main technical problem encountered in this practice was data communication. The other problem was conceptual: I had assumed that compiling all of openssl into a .wasm file would be enough to use it, but it turns out that glue code is needed before it can be used on the web. This raises a question: a wasm file is essentially a binary file, so is there a tool that can run a .wasm file directly? WAPM (WebAssembly Package Manager) is a package management tool for WebAssembly, and the next article will take a look at it.

Origin blog.51cto.com/14915984/2561738