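01. Introduction to libcurl
I have already written a blog post introducing libcurl. If anything here is unclear, have a look at it: "2019: Configuring the HTTP protocol and the libcurl third-party library for POST communication".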
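02. Testing libcurl
Once the setup from that post is done, we can write a small project to see it working. We take Baidu as the example and fetch the content of https://www.baidu.com/.
The code below uses one of the examples from the official site as its template.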
/***************************************************************************
 *                                  _   _ ____  _
 *  Project                     ___| | | |  _ \| |
 *                             / __| | | | |_) | |
 *                            | (__| |_| |  _ <| |___
 *                             \___|\___/|_| \_\_____|
 *
 * Copyright (C) 1998 - 2021, Daniel Stenberg, <daniel@haxx.se>, et al.
 *
 * This software is licensed as described in the file COPYING, which
 * you should have received as part of this distribution. The terms
 * are also available at https://curl.se/docs/copyright.html.
 *
 * You may opt to use, copy, modify, merge, publish, distribute and/or sell
 * copies of the Software, and permit persons to whom the Software is
 * furnished to do so, under the terms of the COPYING file.
 *
 * This software is distributed on an "AS IS" basis, WITHOUT WARRANTY OF ANY
 * KIND, either express or implied.
 *
 ***************************************************************************/
/* <DESC>
 * Shows how the write callback function can be used to download data into a
 * chunk of memory instead of storing it in a file.
 * </DESC>
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <curl/curl.h>
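// Holds the data accumulated across callbacks and its length
typedef struct MemoryStruct {
  char *memory;
  size_t size;
} MemoryStruct;

// File-scope static function that receives the data sent by the server
/* Parameters
 * void *contents: the raw data delivered by the server, passed through unmodified
 * size_t size:    size in bytes of one data element (like the second argument
 *                 of fwrite); libcurl always passes 1
 * size_t nmemb:   number of elements (like the third argument of fwrite);
 *                 this call delivers size * nmemb bytes
 * void *userp:    the pointer registered with CURLOPT_WRITEDATA; often a
 *                 FILE *, here a MemoryStruct * (see the usage below)
 */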
static size_t WriteMemoryCallback(void *contents, size_t size, size_t nmemb, void *userp)
{
  size_t realsize = size * nmemb;  // total number of bytes delivered by this call
  struct MemoryStruct *mem = (struct MemoryStruct *)userp;  // buffer owned by the caller; we append to it

  // grow the buffer so it can hold the new data plus a terminating NUL
  char *ptr = realloc(mem->memory, mem->size + realsize + 1);
  if(!ptr) {
    /* out of memory! */
    printf("not enough memory (realloc returned NULL)\n");
    return 0;
  }

  mem->memory = ptr;
  memcpy(&(mem->memory[mem->size]), contents, realsize);
  mem->size += realsize;
  mem->memory[mem->size] = 0;

  // report every byte as handled; any other return value makes libcurl
  // treat it as an error and abort the transfer
  return realsize;
}
int main(void)
{
  CURL *curl_handle;
  CURLcode res;
  struct MemoryStruct chunk;
  struct curl_slist *headers = NULL;  // HTTP request headers (not used in this example)

  chunk.memory = malloc(1);  /* will be grown as needed by the realloc above */
  chunk.size = 0;            /* no data at this point */

  curl_global_init(CURL_GLOBAL_ALL);

  /* init the curl session */
  curl_handle = curl_easy_init();

  /* specify URL to get */
  curl_easy_setopt(curl_handle, CURLOPT_URL, "http://www.baidu.com/");

  /* send all data to this function */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEFUNCTION, WriteMemoryCallback);

  /* we pass our 'chunk' struct to the callback function; it arrives there
     as the fourth argument (userp) */
  curl_easy_setopt(curl_handle, CURLOPT_WRITEDATA, (void *)&chunk);

  /* some servers don't like requests that are made without a user-agent
     field, so we provide one */
  curl_easy_setopt(curl_handle, CURLOPT_USERAGENT, "libcurl-agent/1.0");

  /* get it! */
  res = curl_easy_perform(curl_handle);

  /* check for errors */
  if(res != CURLE_OK) {
    fprintf(stderr, "curl_easy_perform() failed: %s\n",
            curl_easy_strerror(res));
  }
  else {
    /*
     * Now, our chunk.memory points to a memory block that is chunk.size
     * bytes big and contains the remote file.
     *
     * Do something nice with it!
     */
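    /* strFilePath is the full output path, file name included;
       the value below is only a placeholder, replace it with your own */
    const char *strFilePath = "baidu.html";
    FILE *fptr = fopen(strFilePath, "wb");
    if(!fptr) {
      printf("fopen failed\n");
    }
    else {
      size_t written = fwrite(chunk.memory, 1, chunk.size, fptr);
      if(written == chunk.size)
        printf("content written to file successfully\n");
      else
        printf("fwrite wrote only %lu of %lu bytes\n",
               (unsigned long)written, (unsigned long)chunk.size);
      /* always close the file, then fall through to the cleanup below */
      fclose(fptr);
    }
    printf("%lu bytes retrieved\n", (unsigned long)chunk.size);
  }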
  /* cleanup curl stuff */
  curl_easy_cleanup(curl_handle);

  free(chunk.memory);
  chunk.memory = NULL;

  /* we're done with libcurl, so clean it up */
  curl_global_cleanup();

  return 0;
}
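03. A note on fwrite and append mode
The C function fopen lets you choose among several read/write modes for the file it opens. How "appending" works with fwrite deserves a word, because many articles online explain it poorly: they only say that a appends and w cannot. That leads to a lot of misunderstanding, so here is a clarification (verified with test cases):
1. w: fopen truncates the existing file (if there is one) when it opens it, and writing starts over. As long as the file has not been closed with fclose, successive fwrite calls keep writing at the end of what has been written so far; they do not overwrite the output of earlier fwrite calls.
2. a: fopen keeps the content of the existing file (if there is one), and every fwrite writes at the end of the file.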
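Other modes:
w: text mode, write only
w+: text mode, read and write (truncates on open, like w)
wb: binary mode, write only
wb+: binary mode, read and write
a: text mode, appends to the opened file
ab: binary mode, appends to the opened file
ab+: binary mode, appends to the opened file and can also read it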
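04. Summary
I had written a basic introduction to libcurl before, but when I later used it in real work, a number of bugs kept crashing the program. I assumed all along that the communication itself was at fault and was stuck for two to three weeks. In the end I had no choice but to download the source code, and found that the callback must be a static function defined outside the class. A non-static member function takes a hidden this argument, so it does not match the plain C function signature that libcurl calls. What I observed was that the callback address seemed to differ on every call, which gave the false impression that the address passed as the callback's fourth argument kept changing, so the content was always wrong. I am writing this post as a record of my experience with libcurl.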