Oracle 12.2 RAC permission problem: Linux-x86_64 Error: 13: Permission denied

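1. Background

For reasons unknown, someone on site ran chmod 777 /oracle on node rac2 (presumably recursively, since everything underneath was affected), which changed the permissions of all the oracle and grid files. After noticing that the database was behaving abnormally they restarted it, and it would not come back up. According to the on-site staff, only the permissions on rac2 were changed; rac1 was not touched.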

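2. Approaches

There are two ways to fix this:

2.1 Use Oracle's official method

After the cluster software has been installed correctly, two files are generated under $GRID_HOME/crs/utl: crsconfig_dirs and crsconfig_fileperms. They record the permissions of the core files and directories, which makes recovery straightforward.

Run as root:

For 11.2: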

#cd <GRID_HOME>/crs/install/

#./rootcrs.pl -init

For 12c and later:

#cd <GRID_HOME>/crs/install/

#./rootcrs.sh -init
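
Before running the reset, it can be useful to see which entries have drifted from the recorded permissions. The following is a minimal sketch, not an Oracle tool. It assumes each line of crsconfig_fileperms has five whitespace-separated fields (platform, path, owner, group, octal mode); check your own copy of the file before relying on that. The name check_fileperms is hypothetical.

```shell
#!/bin/sh
# check_fileperms FILE
# ASSUMPTION: each non-comment line of FILE looks like
#   unix /path/to/file owner group 0755
# Prints one line per existing path whose owner, group or mode differs
# from what FILE records. Uses GNU stat (Linux).
check_fileperms() {
    while read -r platform path owner group mode; do
        # Skip blank lines, comment lines and paths missing on this node.
        case "$platform" in ''|'#'*) continue ;; esac
        [ -e "$path" ] || continue
        actual=$(stat -c '%U %G %a' "$path")
        # stat prints modes without leading zeros, so 0755 becomes 755.
        want_mode=$(printf '%s' "$mode" | sed 's/^0*//')
        [ -n "$want_mode" ] || want_mode=0
        expected="$owner $group $want_mode"
        if [ "$actual" != "$expected" ]; then
            printf 'MISMATCH %s: have [%s], recorded [%s]\n' \
                "$path" "$actual" "$expected"
        fi
    done < "$1"
}
```

It would be called with the path to the generated crsconfig_fileperms file as its only argument. The function only reads; it changes nothing on disk.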

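2.2 Use the operating system permission tools getfacl and setfacl

Capture the ownership and permissions of the /oracle directory from the healthy node rac1, then copy them to rac2 and restore from them.

This is the method used in this case. Below, we reproduce the scenario on a test RAC environment and walk through the recovery.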
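
To see why this approach can bring back more than plain rwx bits, the round trip can be tried on a scratch directory first. This is a sketch under the assumption that the acl package (getfacl, setfacl) is installed; acl_roundtrip_demo is a made-up name and touches only a temporary directory. getfacl records the owner, the group and a "# flags:" line for setuid, setgid and sticky bits, and setfacl --restore reapplies them.

```shell
#!/bin/sh
# Backup/restore round trip on a scratch directory.
# Requires getfacl and setfacl (acl package) and GNU stat.
acl_roundtrip_demo() {
    dir=$(mktemp -d) || return 1
    touch "$dir/binary"
    chmod 6751 "$dir/binary"           # same mode as a healthy oracle binary
    getfacl -pR "$dir" > "$dir.acl"    # -p keeps absolute paths, -R recurses
    chmod -R 777 "$dir"                # simulate the accident
    setfacl --restore="$dir.acl"       # reapply owner, group, mode and flags
    stat -c '%a' "$dir/binary"         # print the mode after the restore
    rm -rf "$dir" "$dir.acl"
}
```

If the mode printed at the end is 6751, the setuid and setgid bits survived the round trip, which is what the real recovery depends on.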

3. Test recovery steps

3.1 First, back up the directory permissions on rac2
root@rac2[/soft/backup]#getfacl -pR /u01/app >/soft/backup/backup.txt

3.2 Change the permissions on rac2 to 777
root@rac2[/soft/backup]#chmod -R 777 /u01/app

3.3 Stop the database instance on rac2, then start it again to see whether it comes up
grid@rac2[/home/grid]$srvctl stop instance -d test -n rac2
grid@rac2[/home/grid]$crsctl stat res -t
ora.test.db
      1        ONLINE  ONLINE       rac1                     Open,HOME=/u01/app/o
                                                             racle/product/12.1.0
                                                             ,STABLE
      2        OFFLINE OFFLINE                               Instance Shutdown,ST
                                                             ABLE
3.4 Start it again; as expected, it fails
grid@rac2[/home/grid]$srvctl start instance -d test -n rac2
PRCR-1013 : Failed to start resource ora.test.db
PRCR-1064 : Failed to start resource ora.test.db on node rac2
CRS-5017: The resource action "ora.test.db start" encountered the following error: 
ORA-00205: error in identifying control file, check alert log for more info
. For details refer to "(:CLSN00107:)" in "/u01/app/grid/grid/diag/crs/rac2/crs/trace/crsd_oraagent_oraclerac.trc".

CRS-2674: Start of 'ora.test.db' on 'rac2' failed
grid@rac2[/home/grid]$

3.5 Check the alert log
grid@rac2[/home/grid]$locate alert_test2.log
/u01/app/oracle/diag/rdbms/test/test2/trace/alert_test2.log
2019-12-19T16:30:30.255541+08:00
Error attempting to elevate LMS1's priority: no further priority changes will be attempted for this process
.....
Decreasing number of high priority LMS from 2 to 0
2019-12-19T16:33:31.842137+08:00
WARNING: failed to register ASMB0 with ASM instance
2019-12-19T16:33:31.842550+08:00
Errors in file /u01/app/oracle/diag/rdbms/test/test2/trace/test2_asmb_25121.trc:
ORA-01034: ORACLE not available
ORA-27121: unable to determine size of shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 6761
Additional information: 5996550
Stopping background process RBAL
2019-12-19T16:33:32.844553+08:00
WARNING: ASMB0 exiting with error
2019-12-19T16:33:34.845548+08:00
Starting background process ASMB
2019-12-19T16:33:34.862397+08:00
ASMB started with pid=44, OS id=29615
2019-12-19T16:36:35.670841+08:00
WARNING: failed to register ASMB0 with ASM instance
WARNING: ASMB0 exiting with error
2019-12-19T16:36:35.700823+08:00
Starting background process ASMB
2019-12-19T16:36:35.717018+08:00
ASMB started with pid=44, OS id=14051
2019-12-19T16:39:36.844672+08:00
WARNING: failed to register ASMB0 with ASM instance
WARNING: ASMB0 exiting with error
2019-12-19T16:39:36.848684+08:00
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DATA/test/control02.ctl'
ORA-17503: ksfdopn:2 Failed to open file +DATA/test/control02.ctl
ORA-15001: diskgroup "DATA" does not exist or is not mounted
ORA-01034: ORACLE not available
ORA-27121: unable to determine size of shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 6761
Additional information: 5996550
ORA-00210: cannot open the specified control file
ORA-00202: control file: '+DATA/test/control01.ctl'
ORA-17503: ksfdopn:2 Failed to open file +DATA/test/control01.ctl
ORA-15001: diskgroup "DATA" does not exist or is not mounted

ORA-01034: ORACLE not available
ORA-27121: unable to determine size of shared memory segment
Linux-x86_64 Error: 13: Permission denied
Additional information: 6761
Additional information: 5996550
2019-12-19T16:39:36.850963+08:00
ORA-205 signalled during: ALTER DATABASE MOUNT /* db agent *//* {2:18183:4569} */...
2019-12-19T16:39:39.036289+08:00
License high water mark = 2
2019-12-19T16:39:39.036741+08:00
USER (ospid: 18565): terminating the instance
2019-12-19T16:39:40.040061+08:00
Instance terminated by USER, pid = 18565
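A pile of errors like these is reported. If you check, though, the disk group is mounted and the control files all exist. The real clue is Linux-x86_64 Error: 13: Permission denied, which is usually caused by the permissions on <ORACLE_HOME>/bin/oracle. That is clearly the case here, because we changed the permissions of every file under /u01/app: chmod 777 strips the setuid and setgid bits that the binary needs (a healthy binary shows -rwsr-s--x, as in step 3.10).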
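
A quick way to confirm this diagnosis is to test the binary for the setuid and setgid bits directly. The helper names below (has_setuid_setgid, report_mode) are illustrative only, and the sketch assumes GNU stat.

```shell
#!/bin/sh
# has_setuid_setgid FILE
# Succeeds only if FILE carries both the setuid and the setgid bit.
has_setuid_setgid() {
    [ -u "$1" ] && [ -g "$1" ]
}

# report_mode FILE
# Prints the octal mode of FILE followed by a verdict.
report_mode() {
    mode=$(stat -c '%a' "$1") || return 1
    if has_setuid_setgid "$1"; then
        printf '%s %s ok\n' "$mode" "$1"
    else
        printf '%s %s MISSING setuid/setgid\n' "$mode" "$1"
    fi
}
```

For example, report_mode "$ORACLE_HOME/bin/oracle" could be run as the database owner and again as grid (whose ORACLE_HOME points at the Grid home) to check both binaries.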

3.6 Capture the permissions from rac1
Only /u01/app on rac2 was changed; the permissions on rac1 are intact, so we use rac1's permissions to restore rac2.
root@rac1[/root]#getfacl -pR /u01/app >/soft/backup.txt

3.7 Copy the file to rac2
root@rac1[/root]#scp /soft/backup.txt rac2:/soft/backup_rac1.txt
backup.txt                                                                                      100%   24MB  77.3MB/s   00:00    
root@rac1[/root]#

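3.8 Replace the hostname, database instance name and ASM instance name in /soft/backup_rac1.txt
sed -i 's/rac1/rac2/g' /soft/backup_rac1.txt
sed -i 's/test1_/test2_/g' /soft/backup_rac1.txt
sed -i 's/ASM1/ASM2/g' /soft/backup_rac1.txt
Every instance on the node has to be handled this way. Here only the test database runs on rac2, so only the instance name for test needs changing. Note that the test1_ pattern rewrites only names containing test1_ (trace files, for example); paths such as .../rdbms/test/test1/... are left unchanged and will simply be reported as missing on rac2. The list of databases can be obtained with: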
grid@rac1[/home/grid]$srvctl config database
joyce
test
Two databases are returned. joyce is a RAC One Node database that runs only on rac1 and not on rac2, so nothing needs changing for it.
root@rac2[/root]#sed -i 's/rac1/rac2/g' /soft/backup_rac1.txt
root@rac2[/root]#sed -i 's/test1_/test2_/g' /soft/backup_rac1.txt
root@rac2[/root]#sed -i 's/ASM1/ASM2/g' /soft/backup_rac1.txt
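
A global sed over the whole file also touches owner, group and ACL entry lines if they happen to contain the same strings. As a more cautious variant, the substitutions can be limited to the "# file:" header lines that getfacl writes. This is only a sketch: rewrite_acl_paths is a hypothetical name, and the three patterns are the ones used in this example environment.

```shell
#!/bin/sh
# rewrite_acl_paths FILE
# Prints FILE to stdout with the node-specific names replaced, but only
# on "# file:" header lines. "# owner:", "# group:" and ACL entry lines
# pass through unchanged. The input file itself is not modified.
rewrite_acl_paths() {
    sed '/^# file: /{
        s/rac1/rac2/g
        s/test1_/test2_/g
        s/ASM1/ASM2/g
    }' "$1"
}
```

Writing the result to a new file (for instance rewrite_acl_paths /soft/backup_rac1.txt > /soft/backup_rac2.txt, where the output name is arbitrary) keeps the original backup available for comparison with diff.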

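3.9 Restore rac2 from the modified permission file
setfacl --restore=/soft/backup_rac1.txt
Because the backup was taken with -p, the paths in it are absolute, so the command can be run from any directory. When it hits a file that does not exist, it reports an error, skips that entry and carries on with the rest; it does not abort, so errors about missing files are nothing to worry about.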
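
To know in advance which entries will be skipped, the paths in the backup can be checked against the local filesystem before restoring. A minimal sketch, assuming that no path contains characters that getfacl escapes (spaces, for example, are written as \040); missing_paths is an illustrative name.

```shell
#!/bin/sh
# missing_paths FILE
# Lists every path named in a getfacl backup that does not exist on
# this node.
# ASSUMPTION: paths contain no characters that getfacl escapes.
missing_paths() {
    sed -n 's/^# file: //p' "$1" | while IFS= read -r path; do
        # -L also accepts dangling symlinks, which -e reports as absent.
        [ -e "$path" ] || [ -L "$path" ] || printf '%s\n' "$path"
    done
}
```

Running missing_paths /soft/backup_rac1.txt on rac2 gives a list that can be reviewed to make sure nothing important, such as a binary under the Oracle or Grid home, is among the skipped entries.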

3.10 Check the permissions of the key file
oraclerac@rac2[/home/oraclerac]$ll $ORACLE_HOME/bin/oracle
-rwsr-s--x 1 oraclerac oinstall 408857200 Dec 19 12:38 /u01/app/oracle/product/12.1.0/bin/oracle
oraclerac@rac2[/home/oraclerac]$
The permissions on the oracle binary are back.

3.11 Try starting the test instance again
grid@rac2[/home/grid]$srvctl start instance -d test -n rac2
grid@rac2[/home/grid]$crsctl stat res -t
ora.test.db
      1        ONLINE  ONLINE       rac1                     Open,HOME=/u01/app/o
                                                             racle/product/12.1.0
                                                             ,STABLE
      2        ONLINE  ONLINE       rac2                     Open,HOME=/u01/app/o
                                                             racle/product/12.1.0
                                                             ,STABLE

--------------------------------------------------------------------------------
The instance now starts.

3.12 Next, test the method provided by Oracle
Stop the instance on rac2:
grid@rac2[/home/grid]$srvctl stop instance -d test -n rac2

3.13 Change the permissions of /u01/app on rac2
root@rac2[/root]#chmod -R 777 /u01/app

3.14 Verify that the permissions changed
grid@rac2[/home/grid]$ll $ORACLE_HOME/bin/oracle
-rwxrwxrwx 1 grid oinstall 373556656 Nov 13 22:21 /u01/app/grid/product/bin/oracle

3.15 At this point the instance still does not start.

3.16 Go to the <GRID_HOME>/crs/install directory
cd <GRID_HOME>/crs/install/
root@rac2[/u01/app/grid/product/crs]#cd /u01/app/grid/product/crs/install
root@rac2[/u01/app/grid/product/crs/install]#./rootcrs.sh -init
/u01/app/grid/product/perl/bin/perl: symbol lookup error: /root/perl5/lib/perl5/auto/XML/Parser/Expat/Expat.so: undefined symbol: Perl_xs_apiversion_bootcheck
The command '/u01/app/grid/product/perl/bin/perl -I/u01/app/grid/product/perl/lib -I/u01/app/grid/product/crs/install /u01/app/grid/product/crs/install/rootcrs.pl -init' execution failed
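Unfortunately, this failed. The message shows Grid's bundled perl (/u01/app/grid/product/perl/bin/perl) loading Expat.so from /root/perl5, a module built for a different perl, hence the undefined symbol. This kind of Perl version conflict is fairly troublesome to sort out, so the getfacl/setfacl method above is the recommended one.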
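
One thing that may be worth trying, although it was not tested here, is running the script with root's personal Perl settings removed. The sketch below assumes the /root/perl5 modules are being picked up through the environment variables that local::lib usually sets (PERL5LIB and friends); if they come from somewhere else, this will not help. run_without_perl_env is a hypothetical wrapper name, and env -u requires GNU coreutils.

```shell
#!/bin/sh
# run_without_perl_env CMD [ARGS...]
# Runs CMD with the variables that local::lib normally sets removed,
# so a bundled perl does not load modules from ~/perl5.
# ASSUMPTION: these variables are what makes Grid's perl load
# /root/perl5/.../Expat.so; this was not verified in the test above.
run_without_perl_env() {
    env -u PERL5LIB -u PERL_LOCAL_LIB_ROOT -u PERL_MB_OPT -u PERL_MM_OPT "$@"
}
```

From <GRID_HOME>/crs/install the call would be run_without_perl_env ./rootcrs.sh -init. Only the child process is affected; the current shell keeps its environment.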

Reposted from blog.csdn.net/kadwf123/article/details/103619433