Calling the Microsoft Cognitive Services speech recognition service from Qt

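I recently needed to evaluate speech recognition and tried Microsoft's offline and online options. I ran into a few problems while testing the Microsoft Cognitive Services speech recognition service, so I am recording them here.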


Step 1: subscribe on the Microsoft Cognitive Services website to get a trial key.


Step 2: obtain a token

POST https://api.cognitive.microsoft.com/sts/v1.0/issueToken
Content-Length: 0

Ocp-Apim-Subscription-Key (ASCII): Your subscription key.


The main code is as follows:

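#include <QDebug>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QSslConfiguration>

Authentication::Authentication()
{
    m_nmNetAccess = new QNetworkAccessManager(this);
    connect(m_nmNetAccess, SIGNAL(finished(QNetworkReply*)), this, SLOT(slotFinishReply(QNetworkReply*)));
    init();
}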
void Authentication::init()
{
    QString urlAdress = "https://api.cognitive.microsoft.com/sts/v1.0/issueToken";
    QSslConfiguration config;
    // VerifyNone skips certificate verification; acceptable for a quick test only
    config.setPeerVerifyMode(QSslSocket::VerifyNone);
    config.setProtocol(QSsl::TlsV1_0OrLater);
    m_authenticationRequest.setSslConfiguration(config);
    m_authenticationRequest.setUrl(QUrl(urlAdress));
    // The header value must be the string "0"; a literal 0 would become a null QByteArray
    m_authenticationRequest.setRawHeader("Content-Length", "0");
    m_authenticationRequest.setRawHeader("Content-type", "application/x-www-form-urlencoded");
    m_authenticationRequest.setRawHeader("Ocp-Apim-Subscription-Key", "your key");
    QByteArray array;    // empty request body
    m_nmNetAccess->post(m_authenticationRequest, array);
    connect(&m_timerExpired, SIGNAL(timeout()), this, SLOT(updateToken()));
    m_timerExpired.start(540000);    // refresh the token every 9 minutes
}
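void Authentication::slotFinishReply(QNetworkReply *reply)
{
    // Only keep the body as a token when the request actually succeeded
    if (reply->error() == QNetworkReply::NoError)
    {
        m_token = reply->readAll();
        qDebug() << "TOKEN: " << m_token;
    }
    else
    {
        qDebug() << "Token request failed: " << reply->errorString();
    }
    reply->deleteLater();
}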

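The response body is a Base64-encoded string. Save it as-is without any processing; it is used in the next step. The token expires after roughly 10 minutes, so it must be refreshed before then, otherwise it becomes invalid.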
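The refresh rule can be sketched independently of Qt. The following is a minimal illustration in plain standard C++, not the project's actual code: `TokenCache`, its `lifetime` and `margin` parameters, and the idea of passing the current time in explicitly are all assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <string>

// Hypothetical helper (not part of the original project): remembers when a
// token was obtained and reports when it should be requested again.
class TokenCache {
public:
    using Clock = std::chrono::steady_clock;

    // lifetime: how long the service accepts a token (supplied by the caller).
    // margin:   safety margin so the refresh happens before the token expires.
    TokenCache(std::chrono::seconds lifetime, std::chrono::seconds margin)
        : m_lifetime(lifetime), m_margin(margin) {
        assert(margin < lifetime);
    }

    // Store a freshly issued token together with the time it was obtained.
    void store(const std::string& token, Clock::time_point now) {
        m_token = token;
        m_issuedAt = now;
        m_hasToken = true;
    }

    // True when no token is stored yet, or the stored one is within
    // `margin` of the end of its lifetime.
    bool needsRefresh(Clock::time_point now) const {
        if (!m_hasToken) return true;
        return now - m_issuedAt >= m_lifetime - m_margin;
    }

    const std::string& token() const { return m_token; }

private:
    std::chrono::seconds m_lifetime;
    std::chrono::seconds m_margin;
    std::string m_token;
    Clock::time_point m_issuedAt{};
    bool m_hasToken = false;
};
```

With a lifetime of 10 minutes and a margin of 1 minute, `needsRefresh` becomes true 9 minutes after `store`, which matches the 540000 ms timer used in the code above. Passing `now` in explicitly keeps the class easy to test without waiting for real time to pass.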


Step 3: POST the audio data

POST /recognize?scenarios=catsearch&appid=f84e364c-ec34-4773-a783-73707bd9a585&locale=en-US&device.os=wp7&version=3.0&format=xml&requestid=1d4b6030-9099-11e0-91e4-0800200c9a66&instanceid=1d4b6030-9099-11e0-91e4-0800200c9a66 HTTP/1.1
Host: speech.platform.bing.com
Content-Type: audio/wav; samplerate=16000
Authorization: Bearer [Base64 access_token]

(audio data)

See the official documentation for a detailed explanation. The implementation is as follows:

#include <QJsonArray>
#include <QJsonDocument>
#include <QJsonObject>
#include <QUuid>

RSPluginOnLinePrivate::RSPluginOnLinePrivate(RSPluginMSOnLine* parent)
{
    m_parent = parent;
    m_nmNetAccess = new QNetworkAccessManager(this);
    connect(m_nmNetAccess, SIGNAL(finished(QNetworkReply*)), this, SLOT(slotFinishReply(QNetworkReply*)));
    m_authentication = new Authentication();
    QString urlAdress = "https://speech.platform.bing.com/recognize";
    QSslConfiguration config;
    // VerifyNone skips certificate verification; acceptable for a quick test only
    config.setPeerVerifyMode(QSslSocket::VerifyNone);
    config.setProtocol(QSsl::TlsV1_0OrLater);
    m_recogniseRequest.setSslConfiguration(config);
    // Request used to send the recorded audio.
    // QUuid::toString() wraps the GUID in braces, so mid(1, 36) strips them for requestid.
    QString _tmpString = QString("?scenarios=smd&appid=D4D52672-91D7-4C74-8AD8-42B1D98141A5&instanceid=565D69FF-E928-4B7E-87DA-9A750B96D9E3&locale=zh-CN&format=json&version=3.0&device.os=wp7&requestid=%1").arg(QUuid::createUuid().toString().mid(1, 36));
    urlAdress += _tmpString;
    m_recogniseRequest.setUrl(QUrl(urlAdress));
    m_recogniseRequest.setRawHeader("Accept", "application/json;text/xml");
    // Inner quotes must be escaped; "" would merely concatenate adjacent string literals
    m_recogniseRequest.setRawHeader("Content-type", "audio/wav; codec=\"audio/pcm\"; samplerate=8000");
    m_recogniseRequest.setRawHeader("Host", "speech.platform.bing.com");
    m_recogniseRequest.setRawHeader("Connection", "keep-alive");
    m_recogniseRequest.setRawHeader("SendChunked", "true");
}
void RSPluginOnLinePrivate::recognize(const QByteArray& array)
{
    // The Authorization header has the form "Bearer <token>"
    QString token = m_authentication->token();
    QString authorization = "Bearer " + token;
    m_recogniseRequest.setRawHeader("Authorization", authorization.toUtf8());
    m_nmNetAccess->post(m_recogniseRequest, array);
}
void RSPluginOnLinePrivate::slotFinishReply(QNetworkReply *reply)
{
    QJsonDocument doc = QJsonDocument::fromJson(reply->readAll());
    QString ss = doc.toJson(QJsonDocument::Indented);    // pretty-printed copy, handy when debugging

    QJsonObject object = doc.object();
    QString result;
    bool bSuccess = false;
    // header.status tells whether the recognition succeeded
    if (object["header"].isObject())
    {
        QJsonObject obj = object["header"].toObject();
        QString status = obj["status"].toString();
        bSuccess = (status == "success");
    }
    if (bSuccess)
    {
        if (object["results"].isArray())
        {
            QJsonArray array = object["results"].toArray();
            // Keep the "name" field; if there are several results, the last one wins
            for (auto val : array)
            {
                if (val.isObject())
                {
                    QJsonObject obj = val.toObject();
                    result = obj["name"].toString();
                }
            }
        }

        m_parent->notice(result);
    }
    reply->deleteLater();
}
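The query string that the constructor above assembles by hand can also be built from key/value pairs, which makes it harder to forget a separator or to send an unescaped character. The sketch below uses only the standard library; `percentEncode` and `buildRecognizeUrl` are hypothetical helper names introduced for illustration, and in a Qt project `QUrlQuery` would be the more natural tool.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>
#include <vector>

// Percent-encode every byte except the RFC 3986 "unreserved" characters
// (letters, digits, '-', '.', '_', '~').
std::string percentEncode(const std::string& value) {
    static const char* hex = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : value) {
        if (std::isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
            out += static_cast<char>(c);
        } else {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Hypothetical helper: appends the parameters to `base` as ?k1=v1&k2=v2...
std::string buildRecognizeUrl(
    const std::string& base,
    const std::vector<std::pair<std::string, std::string>>& params) {
    assert(base.find('?') == std::string::npos);  // base must not carry a query yet
    std::string url = base;
    char separator = '?';
    for (const auto& kv : params) {
        url += separator;
        url += percentEncode(kv.first);
        url += '=';
        url += percentEncode(kv.second);
        separator = '&';
    }
    return url;
}
```

Calling `buildRecognizeUrl("https://speech.platform.bing.com/recognize", {{"locale", "zh-CN"}, {"format", "json"}})` yields the base URL followed by `?locale=zh-CN&format=json`. Values containing characters such as braces or spaces are escaped instead of being sent raw.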


In void RSPluginOnLinePrivate::recognize(const QByteArray& array), array holds the speech data captured from the microphone; a WAV header must be prepended to it before sending.
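As a rough illustration of that step, the sketch below wraps raw PCM samples in the canonical 44-byte RIFF/WAVE header using only the standard library. The helper names `putLE`, `putTag` and `wrapPcmInWav` are made up for this example, and it assumes uncompressed PCM; the sample rate passed in should match the samplerate value sent in the Content-Type header.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append the low `bytes` bytes of `value` in little-endian order,
// which is the byte order used by RIFF/WAVE files.
static void putLE(std::vector<std::uint8_t>& out, std::uint32_t value, int bytes) {
    for (int i = 0; i < bytes; ++i)
        out.push_back(static_cast<std::uint8_t>((value >> (8 * i)) & 0xFF));
}

// Append a four-character chunk tag such as "RIFF" (without the trailing NUL).
static void putTag(std::vector<std::uint8_t>& out, const char (&tag)[5]) {
    out.insert(out.end(), tag, tag + 4);
}

// Hypothetical helper: returns header + samples for uncompressed PCM audio.
std::vector<std::uint8_t> wrapPcmInWav(const std::vector<std::uint8_t>& pcm,
                                       std::uint32_t sampleRate,
                                       std::uint16_t channels,
                                       std::uint16_t bitsPerSample) {
    assert(channels > 0 && bitsPerSample % 8 == 0);
    const std::uint32_t blockAlign = channels * (bitsPerSample / 8u);
    const std::uint32_t byteRate = sampleRate * blockAlign;
    const std::uint32_t dataSize = static_cast<std::uint32_t>(pcm.size());

    std::vector<std::uint8_t> wav;
    wav.reserve(44 + pcm.size());
    putTag(wav, "RIFF");
    putLE(wav, 36 + dataSize, 4);   // size of everything after this field
    putTag(wav, "WAVE");
    putTag(wav, "fmt ");
    putLE(wav, 16, 4);              // size of the fmt chunk for PCM
    putLE(wav, 1, 2);               // audio format 1 = uncompressed PCM
    putLE(wav, channels, 2);
    putLE(wav, sampleRate, 4);
    putLE(wav, byteRate, 4);        // bytes per second
    putLE(wav, blockAlign, 2);      // bytes per sample frame
    putLE(wav, bitsPerSample, 2);
    putTag(wav, "data");
    putLE(wav, dataSize, 4);        // number of sample bytes that follow
    wav.insert(wav.end(), pcm.begin(), pcm.end());
    return wav;
}
```

For mono 16-bit audio at 16 kHz the call would be `wrapPcmInWav(samples, 16000, 1, 16)`. In the Qt code the resulting bytes could be copied into the QByteArray that is passed to recognize().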

The HTTP request parameters are described in detail in Microsoft's official documentation: https://www.azure.cn/cognitive-services/en-us/Speech-api/documentation/API-Reference-REST/BingVoiceRecognition


The full code has been uploaded to http://download.csdn.net/detail/stafniejay/9713476



Reposted from blog.csdn.net/stafniejay/article/details/53694755