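Stress Testing Spring Cloud Netflix Microservices

Goal: stress test a set of microservice providers and consumers to uncover likely problems and find ways to fix them.

Create a client project (Feign) that exposes an HTTP endpoint for JMeter to call. The endpoint uses a Feign client to call a microservice running on another machine:

JMeter --> client (Feign, Hystrix) --> microservice (user-service)

Code on the client: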

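import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class UserController {
    protected final Logger logger = LoggerFactory.getLogger(getClass());

    @Autowired
    UserServiceClient userServiceClient;

    /**
     * Looks up a phone number by user id.
     *
     * Example call: http://localhost:12345/getPhoneNoByUserId?userId=263508
     *
     * @param userId the user id
     * @return the phone number
     */
    @RequestMapping(value = "/getPhoneNoByUserId", method = RequestMethod.GET)
    public String getPhoneNoByUserId(@RequestParam Integer userId) {
        logger.debug("getPhoneNoByUserId received. userId={}", userId);

        // Delegate to the Feign client, which calls user-service over HTTP.
        return userServiceClient.getPhoneNoByUserId(userId);
    }
}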

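The client calls the user-service microservice. The Hystrix fallback is disabled during the test (see the commented-out annotation) so that errors surface instead of being masked: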

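// On newer Spring Cloud releases the package is org.springframework.cloud.openfeign.
import org.springframework.cloud.netflix.feign.FeignClient;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestMethod;
import org.springframework.web.bind.annotation.RequestParam;

/**
 * Client interface for calling the user microservice.
 * @author XuJijun
 */
// Fallback disabled for the stress test:
//@FeignClient(value="user-service", fallback=UserServiceClientHystrix.class)
@FeignClient(value="user-service")
public interface UserServiceClient {
    /**
     * Gets the phone number for the given userId.
     */
    @RequestMapping(value = "/user/getPhoneNoByUserId", method = RequestMethod.GET)
    String getPhoneNoByUserId(@RequestParam(value = "userId") Integer userId);
}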
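
For context, this is roughly what the disabled fallback would look like. It is only a sketch: the class name UserServiceClientHystrix is taken from the commented-out annotation, while the method body and the empty-string return value are assumptions, and the interface is repeated here without its Spring and Feign annotations so that the snippet compiles on its own.

```java
/** Plain-Java copy of the client interface; the Spring and Feign annotations are left out. */
interface UserServiceClient {
    String getPhoneNoByUserId(Integer userId);
}

/**
 * Hypothetical fallback. Only the class name comes from the article; the body is an assumption.
 * In the real project the class would also have to be a Spring bean (for example via @Component)
 * so that Feign can find it.
 */
class UserServiceClientHystrix implements UserServiceClient {
    @Override
    public String getPhoneNoByUserId(Integer userId) {
        // Returned in place of an exception when the remote call fails, times out, or is rejected.
        return "";
    }
}
```

With a fallback like this in place, JMeter would see successful responses carrying an empty phone number rather than errors, which is why it is switched off for the test.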

Error 1 (occurs almost every time when the Thread Group is configured with 10 threads):

com.netflix.hystrix.exception.HystrixRuntimeException ... timed-out and no fallback available.

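Cause:

Hystrix's default execution timeout is 1 second. Because of network delays, some responses arrive more than 1 second after the request was sent. Hystrix treats those calls as timed out, and since no fallback is configured it throws the exception above.

Fix:

Increase the request timeout in application.yml: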

hystrix:  
  command:  
    default:  
      execution:  
        isolation:  
          thread:  
            timeoutInMilliseconds: 30000 # default is 1000
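
To make the effect of the timeout concrete, here is a JDK-only sketch of the same idea: run a call on another thread and give up if it takes longer than a limit. It is not Hystrix code. The class TimeoutSketch, the helper slowCall, the placeholder return value, and the 300 ms / 100 ms / 3000 ms figures are all made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** JDK-only illustration of an execution timeout; this is not Hystrix code. */
class TimeoutSketch {

    /** Hypothetical remote call that takes about delayMs to answer. */
    static Callable<String> slowCall(long delayMs) {
        return () -> {
            Thread.sleep(delayMs);
            return "phone-number-placeholder";
        };
    }

    /** Returns "OK" if the call finishes within timeoutMs, "TIMED_OUT" otherwise. */
    static String callWithTimeout(Callable<String> call, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<String> future = pool.submit(call);
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return "OK";
        } catch (TimeoutException e) {
            return "TIMED_OUT";
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow(); // also interrupts the call if it is still running
        }
    }

    public static void main(String[] args) {
        // Limit shorter than the call: the caller gives up.
        System.out.println(callWithTimeout(slowCall(300), 100));
        // Limit longer than the call: the caller gets its answer.
        System.out.println(callWithTimeout(slowCall(300), 3000));
    }
}
```

The call itself is equally slow in both cases; only the caller's patience changes. Raising timeoutInMilliseconds therefore hides slow responses from the error count but does not make them faster.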

Error 2 (occurs almost every time when the Thread Group is configured with 1000 threads):

com.netflix.hystrix.exception.HystrixRuntimeException ... could not be queued for execution...

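Cause:

The Hystrix thread pool has at most 10 threads by default, and with the default settings requests are not queued. Under stress testing, bursts of more than 10 concurrent requests are common, and the excess requests are rejected.

Fix:

Increase coreSize of the thread pool in application.yml: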

hystrix:  
  threadpool:  
    default:  
      coreSize: 500 # default is 10
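
The rejection can be reproduced with a plain JDK ThreadPoolExecutor that has a fixed number of threads and a SynchronousQueue, which is the queue type listed for maxQueueSize=-1. The sketch below is not Hystrix code; the class and method names are made up, and it deliberately keeps every accepted request busy at the same moment, which is harsher than a real load test where requests finish and free their threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

/** JDK-only illustration of a fixed pool without a queue; this is not Hystrix code. */
class PoolRejectionSketch {

    /**
     * Submits the given number of requests, all busy at the same time, to a pool of
     * coreSize threads backed by a SynchronousQueue. Returns how many were rejected.
     */
    static int countRejected(int coreSize, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreSize, coreSize, 1, TimeUnit.MINUTES, new SynchronousQueue<>());
        CountDownLatch release = new CountDownLatch(1); // keeps every accepted task busy
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await();
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // no free thread and no queue slot
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println(countRejected(10, 11));   // one request more than the pool can hold
        System.out.println(countRejected(500, 500)); // pool large enough for the burst
    }
}
```

In this model the eleventh simultaneous request is turned away as soon as all ten threads are occupied, which matches the "could not be queued for execution" symptom.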

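After fixing both problems, the failure rate was 0 with the Thread Group configured for 10000 threads.

Note: there is a basic formula to follow when sizing the thread pool: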

requests per second at peak when healthy × 99th percentile latency in seconds + some breathing room

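In other words: peak requests per second × (99th-percentile response time in seconds + a buffer).

For example, if the service handles 1000 requests per second at peak and 99% of requests complete within 60 ms, the formula gives 1000 × (0.060 + 0.012) = 72 threads, that is, 60 threads for the load itself plus 12 as breathing room.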
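
The sizing formula can be turned into a small helper. This is a sketch under assumptions: the class and method names are made up, latency and buffer are passed in whole milliseconds so the arithmetic stays in integers, and the result is rounded up to a whole thread.

```java
/** Thread pool sizing from peak throughput and tail latency; names are illustrative. */
class PoolSizeSketch {

    /**
     * peak requests per second x (99th-percentile latency + buffer), rounded up.
     * Latency and buffer are given in milliseconds.
     */
    static long poolSize(long peakRequestsPerSecond, long p99LatencyMs, long bufferMs) {
        long busyMsPerSecond = peakRequestsPerSecond * (p99LatencyMs + bufferMs);
        return (busyMsPerSecond + 999) / 1000; // ceiling division by 1000 ms
    }

    public static void main(String[] args) {
        // 1000 requests per second, 60 ms at the 99th percentile, 12 ms buffer.
        System.out.println(poolSize(1000, 60, 12));
    }
}
```

For the figures used in the text (1000 requests per second, 60 ms, 12 ms buffer) the helper returns 72.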

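Conclusions:

1. Start with the default timeout (1000 ms) and change it only when necessary (in practice it usually does get changed).

2. Start with the default thread pool size of 10 threads and increase it only when more are needed.

3. Set the number of concurrent threads in the stress test according to the concurrency the business actually requires, then tune the two parameters above until the requirements are met.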

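Appendix:

Commonly used Hystrix configuration properties:

Timeout (default 1000 ms, unit: ms)
- hystrix.command.default.execution.isolation.thread.timeoutInMilliseconds
- hystrix.command.HystrixCommandKey.execution.isolation.thread.timeoutInMilliseconds (HystrixCommandKey stands for the key of a specific command)
Thread pool core size
- hystrix.threadpool.default.coreSize (default 10)
Queue
- hystrix.threadpool.default.maxQueueSize (maximum queue length. Default -1, which uses a SynchronousQueue; any other value uses a LinkedBlockingQueue. Changing it from -1 to another value requires a restart, that is, the value cannot be adjusted dynamically. To adjust the limit dynamically, use the next property)
- hystrix.threadpool.default.queueSizeRejectionThreshold (queue length at which requests are rejected, default 5. Requests are rejected once the queue reaches this length even if maxQueueSize has not been reached, so when it is set it is the effective queue size)
  - Note: this property has no effect when maxQueueSize=-1
Circuit breaker
- hystrix.command.default.circuitBreaker.requestVolumeThreshold (minimum number of requests within the rolling window before the circuit can trip. Default 20. Below this volume the circuit stays closed even if every request fails)
- hystrix.command.default.circuitBreaker.sleepWindowInMilliseconds (how long after tripping before a trial request is let through to check for recovery. Default 5000 ms)
- hystrix.command.default.circuitBreaker.errorThresholdPercentage (error percentage at or above which the circuit trips. Default 50)
Fallback
- hystrix.command.default.fallback.isolation.semaphore.maxConcurrentRequests (maximum number of concurrent calls to HystrixCommand.getFallback() allowed from the calling thread. Default 10. Calls beyond the limit cause an exception to be thrown. Note: this setting also applies in THREAD isolation mode)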
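
How the circuit breaker settings interact can be sketched as a small decision function. This is a simplified model rather than Hystrix code: it assumes, following the Hystrix documentation, that requestVolumeThreshold counts all requests in the rolling window (not only failures), it uses the default values from the list, and it ignores the sleep window. The class and method names are made up.

```java
/** Simplified model of the circuit breaker trip decision; this is not Hystrix code. */
class CircuitBreakerSketch {
    // Default values from the property list.
    static final int REQUEST_VOLUME_THRESHOLD = 20;
    static final int ERROR_THRESHOLD_PERCENTAGE = 50;

    /** Returns true if the circuit should open for the given rolling-window counts. */
    static boolean shouldTrip(int totalRequests, int failedRequests) {
        if (totalRequests < REQUEST_VOLUME_THRESHOLD) {
            return false; // too little traffic to judge, whatever the error rate
        }
        int errorPercentage = failedRequests * 100 / totalRequests;
        return errorPercentage >= ERROR_THRESHOLD_PERCENTAGE;
    }

    public static void main(String[] args) {
        System.out.println(shouldTrip(19, 19)); // every request failed, but volume is below 20
        System.out.println(shouldTrip(20, 10)); // volume reached and half of the requests failed
    }
}
```

Both conditions have to hold together: enough traffic in the window, and a high enough share of failures within that traffic.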

Reposted from: http://blog.csdn.net/ClementAD/article/details/54315805
