JsonCpp: reading UTF-8 encoded text with a BOM

Background
When Windows Notepad saves a file as UTF-8, by default it adds three bytes, EF BB BF, to the beginning of the file to indicate that the text is encoded in UTF-8. This marker is called a BOM (byte order mark). This does not happen on Unix or Linux operating systems. If the text is saved in ANSI encoding, no extra bytes are added either.


Difference in the bytes read
Byte stream of the text with a BOM (the three BOM bytes are written here as \xEF\xBB\xBF)
"\xEF\xBB\xBF[\r\n{\r\n\t\"version\": \"1.0.0\",\r\n\t\"messagetype\": \"alarm\",\r\n\t\"cmdtype\": 10009,\r\n\t\"sn\": \"202039248932482934\"

Byte stream of the text without a BOM
"[\r\n{\r\n\t\"version\": \"1.0.0\",\r\n\t\"messagetype\": \"alarm\",\r\n\t\"cmdtype\": 10009,\r\n\t\"sn\": \"202039248932482934\"
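To check which of the two forms a given file has, it can help to print its first few bytes as hex. The following is a minimal sketch; HexPrefix is a hypothetical helper name that does not come from JsonCpp or from the original code.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of JsonCpp): formats the first `count`
// bytes of `bytes` as uppercase hex pairs separated by single spaces.
std::string HexPrefix(const std::string& bytes, std::size_t count)
{
    std::string out;
    for (std::size_t i = 0; i < bytes.size() && i < count; ++i)
    {
        char buf[4];
        // Go through unsigned char first so bytes >= 0x80 are not sign-extended
        const unsigned int value = static_cast<unsigned char>(bytes[i]);
        std::snprintf(buf, sizeof(buf), "%02X", value);
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

For the text with a BOM, the first four bytes come out as EF BB BF 5B; without a BOM, the output starts directly with 5B, the opening bracket.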


Problem
If the byte stream with a BOM is passed to JsonCpp unchanged, parsing with the default settings fails because of the three extra bytes EF BB BF at the start, so these three bytes need to be removed from the text before parsing.
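The removal step can be isolated in a small function. This is a sketch under the assumption that the whole file is already in a std::string; StripUtf8Bom is a hypothetical name, not a JsonCpp function.

```cpp
#include <string>

// Hypothetical helper (not part of JsonCpp): returns a copy of `text`
// without a leading UTF-8 BOM (EF BB BF). Text without a BOM, including
// text shorter than three bytes, is returned unchanged.
std::string StripUtf8Bom(const std::string& text)
{
    static const char kBom[] = "\xEF\xBB\xBF";
    // compare() clamps the length to the string size, so short input
    // simply compares unequal instead of reading out of bounds.
    if (text.compare(0, 3, kBom) == 0)
    {
        return text.substr(3);
    }
    return text;
}
```

The result can then be handed to the parser in place of the raw file content.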


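Code
 // Needs <fstream>, <iterator>, <string> and JsonCpp's <json/json.h>.
 // pFileName is the path of the JSON file to load.
 std::ifstream ifs;
 ifs.open(pFileName, std::ifstream::in | std::ifstream::binary);

 // Read the whole file into a string, byte for byte
 std::string str((std::istreambuf_iterator<char>(ifs)), std::istreambuf_iterator<char>());
 std::string strValidJson;
 // Skip the three-byte UTF-8 BOM (EF BB BF) if the file starts with it
 if ((str.length() >= 3) && (0xef == (unsigned char)str[0]) && (0xbb == (unsigned char)str[1]) && (0xbf == (unsigned char)str[2]))
 {
  strValidJson = str.substr(3);
 }
 else
 {
  strValidJson = str;
 }
 // Start parsing the JSON text
 Json::Reader reader;
 Json::Value root;
 if (!reader.parse(strValidJson, root)) return;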


Origin blog.51cto.com/fengyuzaitu/2412115