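#IT明星不是梦# [AI Voice Chat Application] Part 1: Developing the Client

Part 1: Developing the Client

1. Deploy the backend services

2. Feature demo

3. System architecture

4. Sequence diagram

5. Develop the Java recording feature

6. Implement a multithreaded asynchronous audio player

7. Integrate HttpClient in Java

8. Call the voice chat API

9. Common problems and solutions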

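With the rapid progress of deep learning, AI has entered widespread use. Speech processing has reached commercial quality, so an AI that talks is now a reality, and the major cloud providers have opened AI development platforms that make building AI applications much easier. This article walks through the development of an AI voice chat application.

Code downloads:

Client application: https://github.com/jextop/Walle

Backend deployment scripts: https://github.com/jextop/StarterDeploy

Backend service code: https://github.com/jextop/StarterApi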

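I. Deploy the backend services

Docker is an open-source application container engine. It packages an application and its dependencies into a portable image, which is deployed to a server and run as a container instance.

Docker Compose handles orchestration: it defines and runs an application made up of multiple containers. Each service is declared in docker-compose.yml, and the whole application is created and started as a unit.

We use Docker to deploy the application quickly. The steps, scripts, and measured times are:

Step	Script	Time
Orchestration, image configuration	docker-compose.yml	-
Install Docker (script for Ubuntu servers)	docker.sh	-
Pull images (downloads only when updated)	pull.sh	-
Start the service containers	up.sh	10s
View runtime logs	logs.sh	-
Stop the services	down.sh	15s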

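Script files: https://github.com/jextop/StarterDeploy

├── docker.sh # Installs Docker automatically on Ubuntu

├── docker-compose.yml # Orchestration file that configures each service

├── pull.sh # Pulls the required Docker images

├── up.sh # Starts the required runtime environment in one step

├── logs.sh # Shows the container logs

├── down.sh # Stops the runtime environment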

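1. Install Docker

On an Ubuntu server, run docker.sh to install Docker automatically. The official installation guides are:

https://docs.docker.com/install/linux/docker-ce/ubuntu/

https://docs.docker.com/docker-for-windows/install/

2. Pull the images with pull.sh

The script pulls all required images in one batch. When it finishes, run docker images to list them.

3. Start the containers with up.sh

The script wraps the docker-compose up -d command. After startup, run docker ps to list the container instances.

4. View the logs with logs.sh

The script uses a customized command that follows the logs with timestamps and shows only the lines mentioning errors, warnings, versions, or exceptions, with the match highlighted:

docker-compose logs -ft | grep --color -i -e error -e warn -e version -e exception

5. Stop the services with down.sh

The script runs docker-compose down --remove-orphans, which stops and removes the containers.

6. Open the admin console to check the services

Further reading: "[Zero Cost] Build an Automated Docker Image Build System, One-Click Docker Deployment, Learn and Apply in 3 Hours"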

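II. Feature demo

1. Start the client

Script file: https://github.com/jextop/Walle

├── launch.sh

Running launch.sh packages and starts the client automatically.

2. Speak to the client during the 5-second countdown

3. The client calls the backend service and receives the returned audio data

Details appear in the program's runtime log.

4. The client plays the audio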

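III. System architecture

The AI voice chat application consists of a client and a backend service. The client talks to the backend through a REST API, so front ends can be built for multiple platforms.

The backend integrates the Baidu AI speech service and the Turing Robot cloud service. To make conversations more context-aware, it also integrates the Baidu IP geolocation service. The system architecture is summarized below:

Cloud service	API	Function
Baidu AI speech	Text-to-speech (tts)	text -> audio
Baidu AI speech	Speech recognition (asr)	audio -> text
Turing Robot	Intelligent chat	text -> text
Baidu IP geolocation	IP geolocation	IP -> address

IV. Sequence diagram

After recording finishes, the client sends the audio data to the backend through the REST API. The backend calls the speech recognition, intelligent chat, and text-to-speech cloud services in turn, obtains the resulting audio data, and returns it to the client.

The address obtained from IP geolocation assists the chat service. For example, the text "weather" combined with the address "Zhangjiang, Shanghai" returns the actual local weather; without the address, the reply may be "Where are you?"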
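
The call order described in this section can be sketched in a few lines of Java. This is only an illustration of the flow, not the StarterApi implementation: the class name ChatPipelineSketch, the handle() method, and the three function parameters are all hypothetical, and the cloud services are replaced by stubs.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.BiFunction;
import java.util.function.Function;

class ChatPipelineSketch {
    /**
     * Chains the three steps: audio -> text (asr), text + address -> reply (chat),
     * reply -> audio (tts). All three functions are hypothetical stand-ins.
     */
    static byte[] handle(byte[] audio, String address,
                         Function<byte[], String> asr,
                         BiFunction<String, String, String> chat,
                         Function<String, byte[]> tts) {
        String question = asr.apply(audio);
        // The address from IP geolocation travels with the text as extra context
        String answer = chat.apply(question, address);
        return tts.apply(answer);
    }

    public static void main(String[] args) {
        // Stubs: "recognition" decodes bytes, "chat" echoes, "synthesis" encodes
        byte[] reply = handle(
                "weather".getBytes(StandardCharsets.UTF_8),
                "Zhangjiang, Shanghai",
                bytes -> new String(bytes, StandardCharsets.UTF_8),
                (text, addr) -> text + " in " + addr,
                text -> text.getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(reply, StandardCharsets.UTF_8));
    }
}
```

Passing the services in as functions keeps the ordering logic testable without any network access.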

V. Develop the Java recording feature

Client code files: https://github.com/jextop/Walle

├── audio

│ └── ChatUtil.java

│ └── Player.java

│ └── Recorder.java

│ └── RecordHelper.java

│ └── TimeListener.java

├── http

│ └── HttpUtil.java

├── App.java

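Recording is encapsulated in the Recorder class, which implements Runnable so that it can run asynchronously. It notifies the main program through TimeListener, and the main program calls the backend service when recording ends. RecordHelper.java manages the audio data and provides helpers such as playback and saving to a file, which simplify testing.

Recording uses AudioSystem and DataLine from the javax.sound.sampled package: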

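import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class Recorder implements Runnable {
    private final ByteArrayOutputStream byteOutputStream;
    private final TimeListener timeListener;
    // Stop flag so the capture loop can end; volatile because stop() is called from another thread
    private volatile boolean running = true;

    public Recorder(ByteArrayOutputStream outputStream, TimeListener timeListener) {
        this.byteOutputStream = outputStream;
        this.timeListener = timeListener;
    }

    /** Asks the capture loop to finish after the current read. */
    public void stop() {
        running = false;
    }

    @Override
    public void run() {
        long msStart = System.currentTimeMillis();
        TargetDataLine targetDataLine = null;
        try {
            // Open the capture line with the shared audio format and start recording
            AudioFormat audioFormat = FormatUtil.getAudioFormat();
            DataLine.Info info = new DataLine.Info(TargetDataLine.class, audioFormat);
            targetDataLine = (TargetDataLine) (AudioSystem.getLine(info));
            targetDataLine.open(audioFormat);
            targetDataLine.start();

            // Copy captured audio into the in-memory buffer until stop() is called
            byte[] bytes = new byte[1024 * 8];
            while (running) {
                int count = targetDataLine.read(bytes, 0, bytes.length);
                if (count > 0) {
                    byteOutputStream.write(bytes, 0, count);
                }
            }
        } catch (LineUnavailableException e) {
            System.err.println(e.getMessage());
        } finally {
            if (targetDataLine != null) {
                targetDataLine.close();
            }

            try {
                byteOutputStream.close();
            } catch (IOException e) {
                System.err.println(e.getMessage());
            }
        }

        // Report the recording duration to the listener
        long seconds = (System.currentTimeMillis() - msStart) / 1000;
        if (timeListener != null) {
            timeListener.stopped(seconds);
        }
    }
}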
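
RecordHelper is described above as saving the recorded data, but its code is not shown. As a rough sketch of how raw PCM bytes can be wrapped in a WAV container with the standard library, the following uses AudioSystem.write(). The class WavSketch and the method toWav() are hypothetical names for illustration and are not taken from the Walle repository.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class WavSketch {
    /** Wraps raw PCM bytes in a WAV container, entirely in memory. */
    static byte[] toWav(byte[] pcm, AudioFormat format) {
        // The stream length is given in frames, not bytes
        long frames = pcm.length / format.getFrameSize();
        try (AudioInputStream in = new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            AudioSystem.write(in, AudioFileFormat.Type.WAVE, out);
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Same parameters as FormatUtil: 16 kHz, 16-bit, mono, signed PCM, big-endian
        AudioFormat format = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED, 16000f, 16, 1, 2, 16000f, true);
        byte[] wav = toWav(new byte[32000], format); // one second of silence
        System.out.println("WAV bytes: " + wav.length);
    }
}
```

The result is the PCM payload preceded by a RIFF/WAVE header, which is the shape a backend expecting the wav format would receive.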

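VI. Implement a multithreaded asynchronous audio player

Audio playback is encapsulated in Player.java. It writes the data from an AudioInputStream to a SourceDataLine, and because the class implements Runnable it can play asynchronously on its own thread.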

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;
import java.io.IOException;

class Player implements Runnable {
    private final AudioFormat audioFormat;
    private final AudioInputStream audioStream;

    public Player(AudioInputStream audioStream, AudioFormat audioFormat) {
        this.audioStream = audioStream;
        this.audioFormat = audioFormat;
    }

    @Override
    public void run() {
        try {
            play(audioStream, audioFormat);
        } catch (LineUnavailableException | IOException e) {
            System.err.println(e.getMessage());
        } finally {
            // Release the input stream whether or not playback succeeded
            if (audioStream != null) {
                try {
                    audioStream.close();
                } catch (IOException e) {
                    System.err.println(e.getMessage());
                }
            }
        }
    }

    public static void play(AudioInputStream audioStream, AudioFormat audioFormat) throws IOException, LineUnavailableException {
        if (audioStream == null) {
            return;
        }

        // Fall back to the format carried by the stream itself
        if (audioFormat == null) {
            audioFormat = audioStream.getFormat();
        }

        DataLine.Info lineInfo = new DataLine.Info(SourceDataLine.class, audioFormat);
        SourceDataLine dataLine = (SourceDataLine) AudioSystem.getLine(lineInfo);
        try {
            dataLine.open(audioFormat, 1024);
            dataLine.start();

            // Feed the output line in small chunks until the stream is exhausted
            byte[] bytes = new byte[1024];
            int len;
            while ((len = audioStream.read(bytes)) > 0) {
                dataLine.write(bytes, 0, len);
            }

            // Block until the queued audio has actually been played
            dataLine.drain();
        } finally {
            // Close the line even if opening or writing failed
            dataLine.close();
        }
    }
}
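
The listing above shows the Runnable but not how it is dispatched. One possible way to run such a task off the calling thread is a single-threaded executor, sketched below. The class AsyncPlaySketch and its asyncPlay(Runnable) method are hypothetical; the asyncPlay() in the Walle repository may well be implemented differently.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncPlaySketch {
    // One worker thread, so queued clips play one after another instead of overlapping
    private static final ExecutorService POOL = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "player");
        t.setDaemon(true); // do not keep the JVM alive just for playback
        return t;
    });

    /** Queues a playback task and returns immediately. */
    static Future<?> asyncPlay(Runnable player) {
        return POOL.submit(player);
    }

    public static void main(String[] args) throws Exception {
        // A print statement stands in for a real Player instance
        Future<?> done = asyncPlay(() ->
                System.out.println("playing on " + Thread.currentThread().getName()));
        done.get(); // wait only because this demo would otherwise exit at once
    }
}
```

A single worker is a deliberate choice here: with a larger pool, two replies arriving close together would be played on top of each other.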

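VII. Integrate HttpClient in Java

HttpClient is an efficient, feature-rich HTTP client toolkit for sending and receiving HTTP messages programmatically through its API.

1. Add httpclient, httpmime, and the fastjson dependency used for parsing JSON to pom.xml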

<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.60</version>
</dependency>


<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpmime</artifactId>
    <version>4.5</version>
</dependency>

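2. Wrap the client in HttpUtil.java

Create a RequestConfig and a PoolingHttpClientConnectionManager connection pool.

3. Wrap a sendHttpPost() function

The client sends a POST request with the request parameters when it calls the backend REST API.

4. Wrap ResponseHandler<T> to process the response

ResponseHandler<T> is an interface provided by the httpclient package. Implementing its handleResponse() method is where the response-processing logic goes.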

package org.apache.http.client;

import java.io.IOException;
import org.apache.http.HttpResponse;

public interface ResponseHandler<T> {
    T handleResponse(HttpResponse var1) throws ClientProtocolException, IOException;
}

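Three response handler classes cover the common cases, which keeps the calling code simple:

Source file	HTTP response type
RespStr.java	Returns a string
RespJsonObj.java	Returns a JSONObject
RespData.java	Returns binary data

1) RespStr.java: reads the content of the HttpResponse and converts it to a String

- Call httpResponse.getEntity() to get the response entity

- Call ContentType.getOrDefault() to get the content type

- Call ContentType.getCharset() to get the character encoding

- Call EntityUtils.toString() to convert the response to a string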

public class RespStr implements ResponseHandler<String> {
    @Override
    public String handleResponse(HttpResponse httpResponse) throws ClientProtocolException, IOException {
        HttpEntity entity = httpResponse.getEntity();
        ContentType contentType = ContentType.getOrDefault(entity);
        Charset charset = contentType.getCharset();
        return EntityUtils.toString(entity, charset == null ? Charset.forName("utf-8") : charset);
    }
}

2) RespJsonObj.java: when the response is a JSON object, converts it to a JSONObject and returns it

public class RespJsonObj implements ResponseHandler<JSONObject> {
    @Override
    public JSONObject handleResponse(HttpResponse resp) throws ClientProtocolException, IOException {
        HttpEntity entity = resp.getEntity();
        ContentType contentType = ContentType.getOrDefault(entity);
        Charset charset = contentType.getCharset();
        String jsonStr = EntityUtils.toString(entity, charset == null ? Charset.forName("utf-8") : charset);

        // parse JSON object
        return JsonUtil.parseObj(jsonStr);
    }
}

3) RespData.java: when the HTTP response is binary data, reads the bytes from the entity and takes the file name and related information from the headers.

public class RespData implements ResponseHandler<byte[]> {
    private static final String fileNameFlag = "attachment;fileName=";

    private byte[] bytes;
    private String fileName;

    public byte[] getBytes() {
        return bytes;
    }

    public String getFileName() {
        return fileName;
    }

    @Override
    public byte[] handleResponse(HttpResponse response) throws ClientProtocolException, IOException {
        // Header: Content-Disposition: attachment;fileName=abc.txt
        // The header is optional, so guard against a missing value
        Header header = response.getFirstHeader("Content-Disposition");
        String headerValue = header == null ? null : header.getValue();
        if (headerValue != null && headerValue.startsWith(fileNameFlag)) {
            fileName = headerValue.substring(fileNameFlag.length());
        }

        HttpEntity entity = response.getEntity();
        bytes = EntityUtils.toByteArray(entity);
        return bytes;
    }
}

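VIII. Call the voice chat API

The client Base64-encodes the recorded audio, sets the audio length and format in the request parameters, and sends them to the backend through the REST API. When the backend finishes processing, it returns new audio data, which the client plays with Player.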

public class ChatUtil {
    public static void chat() {
        RecordHelper recordHelper = RecordHelper.getInst();
        final ByteArrayOutputStream data = recordHelper.save(new ByteArrayOutputStream());

        RespData resp = new RespData();
        byte[] ret = HttpUtil.sendHttpPost(
                "http://localhost:8011/speech/walle",
                null, new HashMap<String, Object>() {{
                    put("size", data.size());
                    put("format", "wav");
                    put("audio", B64Util.encode(data.toByteArray()));
                }}, resp
        );
        System.out.printf("%s, %s, %s, %d\n",
                resp.getContentType(), resp.getFileName(), resp.getFileExt(), resp.getContentLength()
        );

        if (ret != null && ret.length > 0) {
            Player.asyncPlay(ret);
        } else {
            // No reply audio came back, so play the user's own recording instead
            recordHelper.play();
        }
    }
}

Make sure the backend service URL is configured correctly.
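
The three request parameters used in chat() can be reproduced with the standard library alone. The sketch below uses java.util.Base64 as a stand-in for the project's B64Util, whose code is not shown here; the class ChatParamsSketch and the method buildParams() are hypothetical names.

```java
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

class ChatParamsSketch {
    /** Builds the size/format/audio parameters for the voice chat request. */
    static Map<String, Object> buildParams(byte[] audio, String format) {
        Map<String, Object> params = new LinkedHashMap<>();
        params.put("size", audio.length);   // byte count of the raw audio, before encoding
        params.put("format", format);       // for example "wav" or "pcm"
        params.put("audio", Base64.getEncoder().encodeToString(audio));
        return params;
    }

    public static void main(String[] args) {
        // Three dummy bytes stand in for a real recording
        System.out.println(buildParams(new byte[] {1, 2, 3}, "wav"));
        // prints {size=3, format=wav, audio=AQID}
    }
}
```

Note that size is the length of the original bytes; the Base64 string is about a third longer.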

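IX. Common problems and solutions

• The HTTP response is garbled, and setting UTF-8 does not fix it?

Cause: garbled text in an HTTP response means it was decoded with the wrong Charset. UTF-8 works in most cases, but not every web server uses UTF-8.

Solution: decode with the charset that the response itself declares:

- Call httpResponse.getEntity() to get the response entity

- Call ContentType.getOrDefault() to get the content type

- Call ContentType.getCharset() to get the character encoding

- Call EntityUtils.toString() to convert the response to a string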

public class RespStr implements ResponseHandler<String> {
    @Override
    public String handleResponse(HttpResponse httpResponse) throws IOException {
        HttpEntity entity = httpResponse.getEntity();
        if (entity == null) {
            throw new ClientProtocolException("Response contains no content");
        }

        // Read the response content
        ContentType contentType = ContentType.getOrDefault(entity);
        Charset charset = contentType.getCharset();
        return EntityUtils.toString(entity, charset == null ? Charset.forName("utf-8") : charset);
    }
}

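ResponseHandler<T> is an interface provided by the httpclient package; implement handleResponse() to process the HTTP response.

• ContentType cannot read a Charset from the HTTP response

Cause: not every web server declares the content encoding correctly, so one more check is needed.

Solution: when ContentType.getCharset() returns null, fall back to UTF-8.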
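
To make the fallback rule concrete without depending on HttpClient, here is a standard-library sketch that reads the charset parameter from a raw Content-Type header value and falls back to UTF-8. It is a simplified stand-in for what ContentType.getCharset() plus the null check achieve, not a replacement for it; the class CharsetSketch and the method resolve() are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

class CharsetSketch {
    /** Returns the charset declared in a Content-Type value, or UTF-8 if none is usable. */
    static Charset resolve(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).replace("\"", "").trim();
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        break; // malformed or unsupported name: use the default
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        System.out.println(resolve("text/plain; charset=ISO-8859-1")); // declared charset wins
        System.out.println(resolve("application/json"));               // nothing declared
    }
}
```

Both the missing-charset case and the unknown-charset case end up at the same default, so callers never have to handle a null.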

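• Recorded audio plays on the client, but the backend returns the error "data format incorrect/unsupported"?

Cause: the backend currently supports the pcm and wav formats, so recording must use an AudioFormat with PCM_SIGNED encoding.

Solution: define one shared audio format in FormatUtil.java and use it everywhere: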

import javax.sound.sampled.AudioFormat;

public class FormatUtil {
    public static AudioFormat getAudioFormat() {
        AudioFormat.Encoding encoding = AudioFormat.Encoding.PCM_SIGNED;
        float rate = 16000f;
        int sampleSize = 16;
        int channels = 1;
        boolean bigEndian = true;
        return new AudioFormat(encoding, rate, sampleSize, channels, (sampleSize / 8) * channels, rate, bigEndian);
    }
}
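
As a quick check on what this format implies for request size, the sketch below computes how many bytes of raw PCM a recording of a given length produces. It assumes the 5-second countdown from the demo is also the recording length; the class RecordingSizeSketch and the method bytesFor() are hypothetical.

```java
import javax.sound.sampled.AudioFormat;

class RecordingSizeSketch {
    /** Raw PCM size in bytes: frames per second * bytes per frame * seconds. */
    static long bytesFor(AudioFormat format, int seconds) {
        return (long) (format.getFrameRate() * format.getFrameSize()) * seconds;
    }

    public static void main(String[] args) {
        // Same parameters as FormatUtil: 16 kHz, 16-bit, mono, signed PCM, big-endian
        AudioFormat format = new AudioFormat(
                AudioFormat.Encoding.PCM_SIGNED, 16000f, 16, 1, 2, 16000f, true);
        System.out.println(bytesFor(format, 5)); // prints 160000
    }
}
```

So a 5-second clip is about 160 KB before Base64 encoding, which adds roughly another third.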
