BASH的保护性编程技巧

jopen 11年前

这是我写BASH程序的招式。这里本没有什么新的内容，但是从我的经验来看，人们爱滥用BASH。他们忽略了计算机科学，而从他们的程序中创造的是“大泥球”（译注：指架构不清晰的软件系统）。

在此我告诉你方法，以保护你的程序免于障碍，并保持代码的整洁。

不可改变的全局变量

尽量少用全局变量
以大写命名
只读声明
用全局变量来代替隐晦的$0，$1等
在我的程序中常使用的全局变量：

readonly PROGNAME=$(basename $0)  readonly PROGDIR=$(readlink -m $(dirname $0))  readonly ARGS="$@"

一切皆是局部的

所有变量都应为局部的。

change_owner_of_file() {      local filename=$1      local user=$2      local group=$3        chown $user:$group $filename  }

change_owner_of_files() {      local user=$1; shift      local group=$1; shift      local files=$@      local i        for i in $files      do          chown $user:$group $i      done  }

自注释（self documenting）的参数
通常作为循环用的变量i，把它声明为局部变量是很重要的。
局部变量不作用于全局域。

kfir@goofy ~ $ local a  bash: local: can only be used in a function

main()

有助于保持所有变量的局部性
直观的函数式编程
代码中唯一的全局命令是：main

main() {      local files="/tmp/a /tmp/b"      local i        for i in $files      do          change_owner_of_file kfir users $i      done  }  main

一切皆是函数

唯一全局性运行的代码是：

- 不可变的全局变量声明

- main()函数

保持代码整洁
过程变得清晰

main() {      local files=$(ls /tmp | grep pid | grep -v daemon)  }

temporary_files() {      local dir=$1        ls $dir \          | grep pid \          | grep -v daemon  }    main() {      local files=$(temporary_files /tmp)  }

第二个例子好得多。查找文件是temporary_files()的问题而非main()的。这段代码用temporary_files()的单元测试也是可测试的。
如果你一定要尝试第一个例子，你会得到查找临时文件以和main算法的大杂烩。

test_temporary_files() {      local dir=/tmp        touch $dir/a-pid1232.tmp      touch $dir/a-pid1232-daemon.tmp        returns "$dir/a-pid1232.tmp" temporary_files $dir        touch $dir/b-pid1534.tmp        returns "$dir/a-pid1232.tmp $dir/b-pid1534.tmp" temporary_files $dir  }

如你所见，这个测试不关心main()。

调试函数

带-x标志运行程序：

bash -x my_prog.sh

只调试一小段代码，使用set-x和set+x，会只对被set -x和set +x包含的当前代码打印调试信息。

temporary_files() {      local dir=$1        set -x      ls $dir \          | grep pid \          | grep -v daemon      set +x  }

打印函数名和它的参数：

temporary_files() {      echo $FUNCNAME $@      local dir=$1        ls $dir \          | grep pid \          | grep -v daemon  }

</div>

调用函数：

temporary_files /tmp

会打印到标准输出：

temporary_files /tmp

代码的清晰度

这段代码做了什么？

main() {      local dir=/tmp        [[ -z $dir ]] \          && do_something...        [[ -n $dir ]] \          && do_something...        [[ -f $dir ]] \          && do_something...        [[ -d $dir ]] \          && do_something...  }  main

让你的代码说话：

is_empty() {      local var=$1        [[ -z $var ]]  }    is_not_empty() {      local var=$1        [[ -n $var ]]  }    is_file() {      local file=$1        [[ -f $file ]]  }    is_dir() {      local dir=$1        [[ -d $dir ]]  }    main() {      local dir=/tmp        is_empty $dir \          && do_something...        is_not_empty $dir \          && do_something...        is_file $dir \          && do_something...        is_dir $dir \          && do_something...  }  main

每一行只做一件事

用反斜杠\来作分隔符。例如：

temporary_files() {      local dir=$1        ls $dir | grep pid | grep -v daemon  }

可以写得简洁得多：

temporary_files() {      local dir=$1        ls $dir \          | grep pid \          | grep -v daemon  }

符号在缩进行的开始

符号在行末的坏例子：（译注：原文在此例中用了temporary_files()代码段，疑似是贴错了。结合上下文，应为print_dir_if_not_empty()）

print_dir_if_not_empty() {      local dir=$1        is_empty $dir && \          echo "dir is empty" || \          echo "dir=$dir"  }

好的例子：我们可以清晰看到行和连接符号之间的联系。

print_dir_if_not_empty() {      local dir=$1        is_empty $dir \          && echo "dir is empty" \          || echo "dir=$dir"  }

打印用法

不要这样做：

echo "this prog does:..."  echo "flags:"  echo "-h print help"

它应该是个函数：

usage() {      echo "this prog does:..."      echo "flags:"      echo "-h print help"  }

echo在每一行重复。因此我们得到了这个文档：

usage() {      cat <<- EOF      usage: $PROGNAME options        Program deletes files from filesystems to release space.       It gets config file that define fileystem paths to work on, and whitelist rules to       keep certain files.        OPTIONS:         -c --config              configuration file containing the rules. use --help-config to see the syntax.         -n --pretend             do not really delete, just how what you are going to do.         -t --test                run unit test to check the program         -v --verbose             Verbose. You can specify more then one -v to have more verbose         -x --debug               debug         -h --help                show this help            --help-config         configuration help        Examples:         Run all tests:         $PROGNAME --test all           Run specific test:         $PROGNAME --test test_string.sh           Run:         $PROGNAME --config /path/to/config/$PROGNAME.conf           Just show what you are going to do:         $PROGNAME -vn -c /path/to/config/$PROGNAME.conf      EOF  }

注意在每一行的行首应该有一个真正的制表符‘\t’。

在vim里，如果你的tab是4个空格，你可以用这个替换命令：

:s/^    /\t/

命令行参数

这里是一个例子，完成了上面usage函数的用法。我从Kirk’s blog post – bash shell script to use getopts with gnu style long positional parameters得到这段代码

cmdline() {      # got this idea from here:      # http://kirk.webfinish.com/2009/10/bash-shell-script-to-use-getopts-with-gnu-style-long-positional-parameters/      local arg=      for arg      do          local delim=""          case "$arg" in              #translate --gnu-long-options to -g (short options)              --config)         args="${args}-c ";;              --pretend)        args="${args}-n ";;              --test)           args="${args}-t ";;              --help-config)    usage_config &amp;&amp; exit 0;;              --help)           args="${args}-h ";;              --verbose)        args="${args}-v ";;              --debug)          args="${args}-x ";;              #pass through anything else              *) [[ "${arg:0:1}" == "-" ]] || delim="\""                  args="${args}${delim}${arg}${delim} ";;          esac      done        #Reset the positional parameters to the short options      eval set -- $args        while getopts "nvhxt:c:" OPTION      do           case $OPTION in           v)               readonly VERBOSE=1               ;;           h)               usage               exit 0               ;;           x)               readonly DEBUG='-x'               set -x               ;;           t)               RUN_TESTS=$OPTARG               verbose VINFO "Running tests"               ;;           c)               readonly CONFIG_FILE=$OPTARG               ;;           n)               readonly PRETEND=1               ;;          esac      done        if [[ $recursive_testing || -z $RUN_TESTS ]]; then          [[ ! -f $CONFIG_FILE ]] \              &amp;&amp; eexit "You must provide --config file"      fi      return 0  }

你像这样，使用我们在头上定义的不可变的ARGS变量：

main() {      cmdline $ARGS  }  main

单元测试

在更高级的语言中很重要。
使用shunit2做单元测试

test_config_line_paths() {      local s='partition cpm-all, 80-90,'        returns "/a" "config_line_paths '$s /a, '"      returns "/a /b/c" "config_line_paths '$s /a:/b/c, '"      returns "/a /b /c" "config_line_paths '$s   /a  :    /b : /c, '"  }    config_line_paths() {      local partition_line="$@"        echo $partition_line \          | csv_column 3 \          | delete_spaces \          | column 1 \          | colons_to_spaces  }    source /usr/bin/shunit2

这里是另一个使用df命令的例子：

DF=df    mock_df_with_eols() {      cat &lt;&lt;- EOF      Filesystem           1K-blocks      Used Available Use% Mounted on      /very/long/device/path                           124628916  23063572 100299192  19% /      EOF  }    test_disk_size() {      returns 1000 "disk_size /dev/sda1"        DF=mock_df_with_eols      returns 124628916 "disk_size /very/long/device/path"  }    df_column() {      local disk_device=$1      local column=$2        $DF $disk_device \          | grep -v 'Use%' \          | tr '\n' ' ' \          | awk "{print \$$column}"  }    disk_size() {      local disk_device=$1        df_column $disk_device 2  }

这里我有个例外，为了测试，我在全局域中声明了DF为非只读。这是因为shunit2不允许改变全局域函数。

原文链接： Kfir Lavi 翻译：伯乐在线 - cjpan
译文链接： http://blog.jobbole.com/73257/