Common Oracle exceptions: IO exception, Connection reset

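I found a thread on the official Oracle forum that discusses a problem similar to mine; its explanation of the cause and its fix are rather interesting. According to the thread, the root cause lies in how Java's secure random number generator is implemented.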

java.security.SecureRandom is a standard API provided by Sun. Among the various methods offered by this class, void nextBytes(byte[]) is one. This method is used for generating random bytes. Oracle 11g JDBC drivers use this API to generate random numbers during login. Users on Linux have been encountering SQLException("Io exception: Connection reset").

The problem is twofold:
1. The JVM tries to list all the files in /tmp (or the alternate tmp directory set by -Djava.io.tmpdir) when SecureRandom.nextBytes(byte[]) is invoked. If the number of files is large, the method takes a long time to respond and hence causes the server to time out.
2. The method void nextBytes(byte[]) uses /dev/random on Linux, and on some machines that lack random number generating hardware the operation slows down to the extent of bringing the whole login process to a halt. Ultimately the user encounters SQLException("Io exception: Connection reset").

Users upgrading to 11g can encounter this issue if the underlying OS is Linux running on faulty hardware.

Cause: The cause of this has not yet been determined exactly. It could be either a problem in your hardware or the fact that, for some reason, the software cannot read from /dev/random.

Solution: Change the setup of your application so that the following parameter is added to the java command:

-Djava.security.egd=file:/dev/../dev/urandom

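The on-site implementation staff were interested in the information in this thread. After many retries in the test environment, we reproduced the problem and captured the call stack at the moment it occurred. The stack dump closely matched the call sequence described in the thread, which suggested that the thread's fix was likely to solve my problem. We applied the change described there, verified it repeatedly in both the test and production environments, and were pleased to find that the problem was resolved.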
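The two suspected causes from the quoted thread (a crowded temp directory and a slow SecureRandom.nextBytes call) can be checked with a small probe before changing any JVM parameter. The following is only a sketch: the class and method names are hypothetical, and it measures the JVM it runs in, not the Oracle JDBC driver itself.

```java
import java.io.File;
import java.security.SecureRandom;

/** Hypothetical diagnostic probe; the names here are illustrative, not from the thread. */
class SecureRandomProbe {

    /** Counts the entries in the JVM temp directory (java.io.tmpdir); returns -1 if it cannot be listed. */
    static int countTmpEntries() {
        String[] names = new File(System.getProperty("java.io.tmpdir")).list();
        return names == null ? -1 : names.length;
    }

    /** Returns the milliseconds needed to create a SecureRandom and fill a buffer of the given size. */
    static long timeNextBytesMillis(int size) {
        long start = System.nanoTime();
        SecureRandom random = new SecureRandom();
        byte[] buffer = new byte[size];
        random.nextBytes(buffer); // this is the call that may stall on an entropy-starved host
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        System.out.println("java.io.tmpdir entries: " + countTmpEntries());
        System.out.println("SecureRandom.nextBytes(16) took " + timeNextBytesMillis(16) + " ms");
    }
}
```

Running it once with and once without the -Djava.security.egd parameter on the affected host gives a rough before/after comparison; a duration of several seconds would be consistent with the blocking described in the thread.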

Final solution

Modify the application's JVM parameters. I found the following variants:

-Djava.security.egd=file:/dev/../dev/urandom
-Djava.security.egd=file:/dev/./urandom
-Djava.security.egd=file:/dev/urandom # Reportedly buggy; I did not verify this further or read the code, so I do not know where the problem lies.
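When the launch command cannot be edited, the same property can in principle be set from code. This is an assumption rather than something verified in this post: it can only take effect if it runs before anything in the process has used SecureRandom, so the -D parameter remains the safer route. The class name below is hypothetical.

```java
import java.security.SecureRandom;

/** Hypothetical startup helper; assumes it runs before any SecureRandom use in the process. */
class EgdOverride {
    static final String EGD_PROPERTY = "java.security.egd";
    static final String URANDOM_URL = "file:/dev/./urandom";

    /** Sets the property only when the launcher did not supply one; returns the value now in effect. */
    static String applyIfAbsent() {
        if (System.getProperty(EGD_PROPERTY) == null) {
            System.setProperty(EGD_PROPERTY, URANDOM_URL);
        }
        return System.getProperty(EGD_PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println(EGD_PROPERTY + " = " + applyIfAbsent());
        byte[] nonce = new byte[16];
        new SecureRandom().nextBytes(nonce); // first SecureRandom use happens after the property is set
        System.out.println("Generated " + nonce.length + " random bytes");
    }
}
```

A value passed with -D on the command line is left untouched, so the helper never overrides an explicit operator choice.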
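Later, while reading other material, I found that the JRE's java.security file already defines this setting: it is the securerandom.source property, which the java.security.egd system property overrides.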

#
# Select the source of seed data for SecureRandom. By default an
# attempt is made to use the entropy gathering device specified by
# the securerandom.source property. If an exception occurs when
# accessing the URL then the traditional system/thread activity
# algorithm is used.
#
# On Solaris and Linux systems, if file:/dev/urandom is specified and it
# exists, a special SecureRandom implementation is activated by default.
# This "NativePRNG" reads random bytes directly from /dev/urandom.
#
# On Windows systems, the URLs file:/dev/random and file:/dev/urandom
# enables use of the Microsoft CryptoAPI seed functionality.
#
securerandom.source=file:/dev/urandom
#
# The entropy gathering device is described as a URL and can also
# be specified with the system property "java.security.egd". For example,
# -Djava.security.egd=file:/dev/urandom
# Specifying this system property will override the securerandom.source
# setting.
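The override rule described in the excerpt above can be checked at runtime. The sketch below reads both settings and applies the stated precedence (a non-empty java.security.egd system property wins over securerandom.source). The class and method names are hypothetical, and the precedence logic only mirrors the comment in the file; it does not call into the JDK's own resolution code.

```java
import java.security.SecureRandom;
import java.security.Security;

/** Hypothetical helper that reports which seed source setting is in effect. */
class SeedSourceCheck {

    /** Applies the documented precedence: the system property, when non-empty, overrides the security property. */
    static String effectiveSeedSource(String systemProperty, String securityProperty) {
        if (systemProperty != null && !systemProperty.isEmpty()) {
            return systemProperty;
        }
        return securityProperty == null ? "" : securityProperty;
    }

    public static void main(String[] args) {
        String egd = System.getProperty("java.security.egd");        // set with -Djava.security.egd=...
        String source = Security.getProperty("securerandom.source"); // read from the java.security file
        System.out.println("java.security.egd   = " + egd);
        System.out.println("securerandom.source = " + source);
        System.out.println("effective source    = " + effectiveSeedSource(egd, source));
        System.out.println("default algorithm   = " + new SecureRandom().getAlgorithm());
    }
}
```

Printing these values from inside the affected application shows whether the JVM parameter actually reached the process.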

Random number generators

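Had I not needed to solve this problem, I would not normally have dug into the underlying implementation, so this was a good opportunity. There is plenty of material online about /dev/random; only the key points are listed here: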

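1) /dev/random is the secure random number device provided by the Linux kernel;

2) /dev/random derives its randomness from environmental noise such as device interrupts; on a machine with few devices, entropy accumulates slowly, and an application that needs many random numbers can exhaust the supply;

3) a read from /dev/random blocks the calling thread until enough entropy is available;

4) /dev/urandom is the non-blocking variant of /dev/random: it avoids the slow generation and the blocking reads, at the cost of slightly weaker security;

5) on Linux, man 4 random gives a fairly detailed description of both /dev/random and /dev/urandom;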
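The points above can be observed directly from Java. The sketch below reads a few bytes from /dev/urandom, which does not block, and prints the kernel's entropy estimate from /proc/sys/kernel/random/entropy_avail. It assumes a Linux host; the class name is hypothetical, and passing /dev/random to readBytes may block for a long time on an entropy-starved machine.

```java
import java.io.BufferedReader;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical probe for the Linux random devices; assumes a Linux host. */
class RandomDeviceProbe {

    /** Reads exactly count bytes from the given device; blocks for as long as the device blocks. */
    static byte[] readBytes(Path device, int count) throws IOException {
        byte[] buffer = new byte[count];
        try (DataInputStream in = new DataInputStream(Files.newInputStream(device))) {
            in.readFully(buffer);
        }
        return buffer;
    }

    /** Returns the kernel's entropy estimate in bits, or -1 if it cannot be read on this system. */
    static int entropyAvailable() {
        Path path = Paths.get("/proc/sys/kernel/random/entropy_avail");
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.US_ASCII)) {
            String line = reader.readLine();
            return line == null ? -1 : Integer.parseInt(line.trim());
        } catch (IOException | NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("entropy_avail = " + entropyAvailable());
        long start = System.nanoTime();
        byte[] bytes = readBytes(Paths.get("/dev/urandom"), 16);
        long micros = (System.nanoTime() - start) / 1_000;
        System.out.println("Read " + bytes.length + " bytes from /dev/urandom in " + micros + " us");
    }
}
```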

References

1)https://community.oracle.com/message/3701989

2)http://www.usn-it.de/index.php/2009/02/20/oracle-11g-jdbc-driver-hangs-blocked-by-devrandom-entropy-pool-empty

3)http://bugs.java.com/bugdatabase/view_bug.do?bug_id=7003784

4)http://bugs.java.com/bugdatabase/view_bug.do?bug_id=6202721

Reposted from blog.csdn.net/weixin_41350766/article/details/80063016