Today I am sharing a problem caused by a wildcard in shell programming.
1. Problem Code
cat test.logs
4567890 *
##*************************************##
rtyuio**tyuio432
##*************************************##
*rtyuiop*2* yuiop
##*************************************##
rtyuiop(3 * 4)iuytr
##*************************************##
8765432
cat script.sh
#!/usr/bin/env bash
# Main function: write each line that does not start with ## to a file, one line per file
logsname=test.logs
i=100
while read line
do
if [[ $line =~ '##' ]];then
((i++))
else
echo $line >> $i.txt
fi
done < "${logsname}"
Result of running script.sh:
In the output you can see that:
4567890 *
is replaced with 4567890 script.sh test.logs
rtyuiop(3 * 4)iuytr
is replaced with rtyuiop(3 100.txt 101.txt 102.txt script.sh test.logs 4)iuytr
The other lines are printed normally, so why do these two lines have problems? The other lines also contain asterisks, so why are those not replaced?
2. Analysis
The line of code that produces the problematic output is: echo $line >> $i.txt
First, a few words about how the shell executes a script:
- The shell reads the script and executes its commands from top to bottom.
- Suppose the current line is 4567890 * and the shell is about to execute echo $line >> $i.txt:
  - First, the shell replaces $line with its value, 4567890 *, so the command becomes: echo 4567890 * >> $i.txt
  - Then, before running echo, the shell checks whether the arguments contain wildcards (here the arguments to echo are: 4567890 *).
  - Obviously, * is a wildcard, and the shell is responsible for expanding it. The shell treats the wildcard as a pattern and looks for matching paths or file names on disk. If something matches, the wildcard is replaced by the matches (pathname expansion); otherwise the wildcard is passed to echo as an ordinary character and handled by echo. This is why lines such as rtyuio**tyuio432 are printed unchanged: the asterisks are part of a longer word, and no file name matches that pattern.
  - After expansion, * is replaced with script.sh test.logs, so the arguments to echo are now: 4567890 script.sh test.logs
  - Finally, the shell runs echo 4567890 script.sh test.logs and redirects the output of echo to the file.
- The same happens for rtyuiop(3 * 4)iuytr: by the time that line is processed, 100.txt, 101.txt and 102.txt already exist, so the standalone * matches them as well.
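The steps above can be reproduced in isolation. The following is a minimal sketch, assuming default shell options (no nullglob or failglob) and a scratch directory created with mktemp; the variable names unquoted, quoted and unmatched are made up for illustration and are not part of the original script.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the expansion steps in an empty scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch script.sh test.logs         # stand-ins for the two files from the post

line='4567890 *'
unquoted=$(echo $line)            # split into two words, then * matches both files
quoted=$(echo "$line")            # quotes suppress splitting and expansion

other='rtyuio**tyuio432'
unmatched=$(echo $other)          # no file name matches, so the pattern is kept

echo "unquoted:  $unquoted"       # unquoted:  4567890 script.sh test.logs
echo "quoted:    $quoted"         # quoted:    4567890 *
echo "unmatched: $unmatched"      # unmatched: rtyuio**tyuio432

cd / && rm -rf "$workdir"         # clean up the scratch directory
```

The third case is the answer to why the other lines look fine: their asterisks are glued to other characters, and the resulting pattern happens to match no file in the directory.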
Tips:
Wildcards look a bit like regular expressions, but they are different and should not be confused with each other. Wildcards can be understood as special characters that the shell itself handles, and the shell only has a few of them: *, ?, [] and {} (strictly speaking, {} is brace expansion rather than pattern matching).
Wildcards are supported by the shell itself, whereas regular expressions need the support of a tool such as grep, awk, vi or perl. Text-filtering tools such as awk and sed use regular expressions, which are applied to the contents of a file. Wildcards are applied to file names or paths, for example with find, ls and cp.
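To make the difference concrete, here is a small sketch that applies * once as a wildcard (in a case pattern) and once as a regular expression (with grep -E). The sample names and the result variables are illustrative assumptions.

```shell
#!/usr/bin/env bash
# Sketch: the same '*' means different things as a wildcard and as a regex.
name='test.logs'

# Wildcard (glob): '*' on its own stands for any run of characters.
case $name in
    *.logs) glob_result=match ;;
    *)      glob_result=no-match ;;
esac

# Regular expression: '*' repeats the previous item, so "any characters"
# is written '.*', and a literal dot has to be escaped.
if printf '%s\n' "$name" | grep -Eq '^.*\.logs$'; then
    regex_result=match
else
    regex_result=no-match
fi

# Read as a regex, 's*' means "zero or more s", so it even matches
# a name that contains no 's' at all.
if printf '%s\n' 'data.txt' | grep -Eq 's*'; then
    loose_result=match
else
    loose_result=no-match
fi

echo "glob: $glob_result, regex: $regex_result, s* as regex: $loose_result"
```

All three tests report a match, but for different reasons, which is why a pattern written for one syntax should not be reused in the other without rewriting it.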
3. Solution
From the analysis above we know that after $line is replaced with its value, the wildcard * is then expanded by the shell. Is there a way to prevent the shell from expanding the wildcard?
- Quote the variable: "$line". With the double quotes around it, "$line" becomes the single string "4567890 *" instead of two separate strings.
- Quoting a variable prevents word splitting and wildcard expansion (i.e. the shell resolving the wildcard), and keeps spaces, newlines or wildcards in the value from breaking the script.
- In shell programming, always quote your variables; this is a good and safe coding practice.
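Applied to the script from section 1, the fix could look like the sketch below. It runs in a scratch directory against a shortened copy of test.logs. Beyond quoting "$line", it makes a few changes that are my own assumptions rather than part of the original: IFS= and -r on read, printf instead of echo, and a case pattern so that only lines that start with ## count as separators.

```shell
#!/usr/bin/env bash
# Sketch of the corrected loop, run in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate a shortened test.logs (two records from the post).
cat > test.logs <<'EOF'
4567890 *
##*************************************##
rtyuiop(3 * 4)iuytr
EOF

logsname=test.logs
i=100
# IFS= and -r keep leading blanks and backslashes intact (additions).
while IFS= read -r line
do
    case $line in
        '##'*) i=$((i + 1)) ;;                       # separator: move to next file
        *)     printf '%s\n' "$line" >> "$i.txt" ;;  # quoted: no expansion
    esac
done < "$logsname"

first=$(cat 100.txt)
second=$(cat 101.txt)
echo "100.txt: $first"     # 100.txt: 4567890 *
echo "101.txt: $second"    # 101.txt: rtyuiop(3 * 4)iuytr

cd / && rm -rf "$workdir"  # clean up the scratch directory
```

The only change strictly required to fix the original problem is the pair of double quotes around $line; the other adjustments just remove further ways in which unusual input could alter a record.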