Fixing iconv transcoding failures when converting GB2312 to UTF-8

Background

The project uses Thrift so that a C# program can call a C++ interface, with the data transmitted as JSON. Thrift uses UTF-8 for transmission by default, while both the C# and C++ programs use a multi-byte encoding by default. Strings therefore have to be converted to UTF-8 before they are sent and converted back to GB2312 when they are received.

Problem

The bug showed up with file paths: certain paths could not be parsed by the C++ client, while paths made up only of Chinese, English, and other ordinary characters worked fine, so an encoding problem was not suspected at first. Debugging eventually traced the issue to the iconv transcoding step: when the conversion fails, the function returns an empty result.

Analysis

The file was named "1 Luanma ⑷} · 々.mp4", which contains special symbols and characters. The guess was that the source character set cannot represent some of them, which makes the transcoding fail.
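A guess like this can be checked by asking iconv where it stopped. The following is a minimal sketch, not code from the project: `ConvDiag` and `diagnose_conversion` are hypothetical names, and it assumes the platform's iconv recognizes the charset names "GB2312", "GB18030", and "UTF-8". When iconv fails, it leaves the input pointer at the first byte it could not convert and sets errno, so both can be reported.

```cpp
#include <iconv.h>
#include <cerrno>
#include <cstddef>
#include <string>

// Hypothetical diagnostic result: where iconv stopped and why.
struct ConvDiag {
    bool ok;             // true if the whole input was converted
    std::size_t offset;  // byte offset in the input where iconv stopped
    int err;             // errno reported by iconv_open/iconv (0 on success)
};

// Hypothetical helper: runs one conversion and reports the failure position.
ConvDiag diagnose_conversion(const char* from, const char* to, const std::string& in)
{
    iconv_t cd = iconv_open(to, from);  // note: target charset comes first
    if (cd == (iconv_t)-1)
        return ConvDiag{false, 0, errno};

    // Generous output buffer so that E2BIG does not hide the real error.
    std::string out(in.size() * 4 + 4, '\0');
    char* src = const_cast<char*>(in.data());
    std::size_t src_left = in.size();
    char* dst = &out[0];
    std::size_t dst_left = out.size();

    errno = 0;
    std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
    int err = (rc == static_cast<std::size_t>(-1)) ? errno : 0;
    iconv_close(cd);

    // On failure src points at the first byte that was not converted.
    return ConvDiag{err == 0, in.size() - src_left, err};
}
```

Calling it with the failing file name as the input and "GB2312" as the source charset should show an offset inside the name and, if the guess is right, an errno of EILSEQ; repeating the call with "GB18030" shows whether the wider charset accepts the same bytes.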

Solution

A search online showed that this is a known problem. The recommendation is to replace the GB2312 encoding with GB18030, which supports more characters.

The original transcoding function
std::string ConvertCode::gbk2utf8(const std::string& strGbk)
{
    return code_convert("GB2312", "UTF-8", strGbk);
}

After the change, the conversion tests pass normally
std::string ConvertCode::gbk2utf8(const std::string& strGbk)
{
    return code_convert("GB18030", "UTF-8", strGbk);
}

The iconv conversion function, for reference
// Requires <iconv.h>, <cstdlib>, <cstring>, <string>
std::string ConvertCode::code_convert(const char* source_charset, const char* to_charset, const std::string& sourceStr)
{
    // Get the conversion handle; iconv_open takes the target charset first
    // and returns (iconv_t)-1 on failure
    iconv_t cd = iconv_open(to_charset, source_charset);
    if (cd == (iconv_t)-1)
        return "";

    size_t inlen = sourceStr.size();

    if (inlen == 0)
    {
        iconv_close(cd);
        return "";
    }

    size_t outlen = 2 * inlen + 1;
    char* inbuf = const_cast<char*>(sourceStr.c_str());
    char* outbuf = (char*)malloc(outlen);
    if (outbuf == NULL)
    {
        iconv_close(cd);
        return "";
    }
    memset(outbuf, 0, outlen);
    char* poutbuf = outbuf; // extra pointer: iconv expects a char** and advances it; passing the address of an array (type char (*)[255]) would be incompatible

    std::string strTemp;
    if (iconv(cd, &inbuf, &inlen, &poutbuf, &outlen) != (size_t)-1)
        strTemp = outbuf; // on success, strTemp holds the transcoded string; on failure it stays empty

    free(outbuf);
    iconv_close(cd);
    return strTemp;
}
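Because the function above returns an empty string on failure, a failed conversion looks the same as an empty input, which is what made this bug hard to spot. One possible variant, sketched below under assumptions, reports failures explicitly and grows its output buffer on demand. `convert_or_throw` is a hypothetical name and the exception-based error handling is an illustrative choice, not something the original project uses.

```cpp
#include <iconv.h>
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical variant of code_convert: throws instead of returning "".
std::string convert_or_throw(const char* from, const char* to, const std::string& in)
{
    iconv_t cd = iconv_open(to, from);  // target charset first, then source
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed: unsupported conversion");

    std::string out(in.size() * 2 + 16, '\0');
    char* src = const_cast<char*>(in.data());
    std::size_t src_left = in.size();
    std::size_t produced = 0;

    while (src_left > 0) {
        char* dst = &out[produced];
        std::size_t dst_left = out.size() - produced;
        std::size_t rc = iconv(cd, &src, &src_left, &dst, &dst_left);
        int err = errno;  // save before any other call can change it
        produced = out.size() - dst_left;
        if (rc != static_cast<std::size_t>(-1))
            break;  // all input consumed
        if (err == E2BIG) {
            out.resize(out.size() * 2);  // output buffer full: grow and retry
            continue;
        }
        iconv_close(cd);
        throw std::runtime_error(err == EILSEQ
            ? "iconv: byte sequence not valid in the source charset"
            : "iconv: incomplete or unconvertible input");
    }

    iconv_close(cd);
    out.resize(produced);  // drop the unused tail of the buffer
    return out;
}
```

With this shape the caller can log the failure instead of silently sending an empty path. The sketch skips the final flush call that stateful encodings need, which should be acceptable for GB2312, GB18030, and UTF-8 since none of them carries shift state.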

Origin www.linuxidc.com/Linux/2019-07/159632.htm