awk: the text-processing power tool for Linux Bash scripts

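Date: 2017/2/28 16:23:29   Editor: Linux教程
awk is genuinely complex, and everyday use touches only a small part of it. I look things up as I go, so these notes on the parts I use regularly are mainly here to make my own lookups easier.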

*Invocation
awk [-F field-separator] 'commands' input-file(s)
By default the field separator is whitespace: any run of spaces or tabs separates two fields.
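
As a small illustration of the -F option, the sketch below splits colon-separated lines. The sample data is hypothetical (written in the style of /etc/passwd), and the variable names sample and result are only there for the example.

```shell
# Hypothetical sample data, colon-separated like /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/zsh'

# -F: makes ":" the field separator; print the login name and the shell.
result=$(printf '%s\n' "$sample" | awk -F: '{ print $1, $7 }')
echo "$result"
```

Without -F: each of these lines would be a single field, because none of them contains a space or tab.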

*Patterns
awk 'BEGIN{} {command} END{}' input.txt
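
A minimal sketch of the three-part layout, using made-up input of three lines: the BEGIN block prints a header, the middle block runs once per line, and the END block reports the count. The counter name n is arbitrary.

```shell
# BEGIN runs before any input, the bare block runs per line, END runs last.
result=$(printf 'a\nb\nc\n' | awk 'BEGIN { print "start" } { n++ } END { print "lines:", n }')
echo "$result"
```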

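*Regular expressions
\ ^ $ . [] | () * + ?
Note that + (one or more) and ? (zero or one) are not available in the default basic-regular-expression mode of grep and sed. awk uses extended regular expressions, and grep -E accepts the same syntax.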
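
To make the two operators concrete, here is a short sketch on a hypothetical word list. It also runs one of the patterns through grep -E for comparison; the variable names are illustrative only.

```shell
# Hypothetical word list, one word per line.
words='color
colour
colouur
clr'

# ? makes the preceding "u" optional (zero or one occurrence).
optional=$(printf '%s\n' "$words" | awk '/^colou?r$/')

# + requires one or more "u".
one_or_more=$(printf '%s\n' "$words" | awk '/^colou+r$/')

# grep needs -E to read + as "one or more".
with_grep=$(printf '%s\n' "$words" | grep -E '^colou+r$')

printf '%s\n---\n%s\n---\n%s\n' "$optional" "$one_or_more" "$with_grep"
```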

*Matching and not matching
awk '{if ($3 ~ /pattern/) actions}' input.txt
awk '{if ($3 !~ /pattern/) actions}' input.txt
awk '{if ($3 == "abc") actions}' input.txt
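
The three tests can be tried on a few made-up records. This sketch uses the pattern-action form, where the condition sits in front of the braces; it behaves the same as an if inside the action. The data and variable names are hypothetical.

```shell
# Hypothetical records: name, role, platform.
data='alice dev linux
bob ops linux-arm
carol dev bsd'

# ~ matches when the regex is found anywhere in the field.
match=$(printf '%s\n' "$data" | awk '$3 ~ /linux/ { print $1 }')

# !~ selects the records where the regex is not found.
nomatch=$(printf '%s\n' "$data" | awk '$3 !~ /linux/ { print $1 }')

# == compares the whole field, so "linux-arm" is not selected.
exact=$(printf '%s\n' "$data" | awk '$3 == "linux" { print $1 }')

printf '%s\n---\n%s\n---\n%s\n' "$match" "$nomatch" "$exact"
```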

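*awk built-in variables
-----------------------------------------------------
ARGC      number of command-line arguments
ARGV      array of command-line arguments
ENVIRON   associative array of the system environment variables
FILENAME  name of the file awk is currently reading
FNR       record number within the current file
FS        input field separator; equivalent to the -F command-line option
NF        number of fields in the current record
NR        number of records read so far
OFS       output field separator
ORS       output record separator
RS        input record separator
-----------------------------------------------------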
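
A few of these variables in action, as a minimal sketch on two made-up colon-separated lines: FS and OFS are set in BEGIN, and each record prints its number, its field count, its first field and its last field.

```shell
# FS splits input on ":", OFS joins the printed values with "-".
result=$(printf 'a:b:c\nd:e\n' | awk 'BEGIN { FS = ":"; OFS = "-" } { print NR, NF, $1, $NF }')
echo "$result"
```

$NF works because NF holds the number of fields, so $NF is the field with that index, that is, the last one.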

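*awk built-in string functions
-----------------------------------------------------
gsub(r, s)         replace every match of r in $0 with s
gsub(r, s, t)      replace every match of r in t with s
index(s, t)        return the position of the first occurrence of string t in s (0 if absent)
length(s)          return the length of s
match(s, r)        test whether s contains a substring matching r; return its position, or 0 if there is none
split(s, a, fs)    split s into array a on the separator fs
sprintf(fmt, exp)  return exp formatted according to fmt
sub(r, s)          replace the leftmost longest substring of $0 matching r with s
substr(s, p)       return the suffix of s that starts at position p
substr(s, p, n)    return the substring of s of length n that starts at position p
-----------------------------------------------------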
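
The sketch below calls each of these functions once on a hypothetical string, entirely inside a BEGIN block so that no input file is needed. The variable names (s, t, u, parts and so on) are arbitrary.

```shell
result=$(awk 'BEGIN {
    s = "hello world"
    print length(s)                # number of characters in s
    print index(s, "world")        # position of the first "world"
    print substr(s, 7)             # suffix starting at position 7
    print substr(s, 1, 5)          # 5 characters starting at position 1
    t = s
    n = gsub(/o/, "0", t)          # replace every "o" in t, return the count
    print n, t
    u = s
    sub(/o/, "0", u)               # replace only the first "o" in u
    print u
    m = match(s, /wor/)            # also sets RSTART and RLENGTH
    print m, RSTART, RLENGTH
    k = split("a,b,c", parts, ",") # returns the number of pieces
    print k, parts[1], parts[3]
    print sprintf("%03d", 7)       # zero-padded to width 3
}')
echo "$result"
```

Positions in awk strings start at 1, which is why index reports 7 for "world" here.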

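$1, $2, ... are automatic variables that refer to the first, second, ... field of the current record; $0 is the whole record.
BEGIN runs first, before any input is read. END (if present) runs after awk has read all input lines.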
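
A short sketch tying fields and END together, on made-up two-column data: each line prints its first field next to the whole record, and the END block prints the running total of the second field. The name sum is arbitrary.

```shell
# $1 is the first field, $0 the whole line; sum accumulates the second field.
result=$(printf 'apple 3\npear 5\nfig 4\n' | awk '{ sum += $2; print $1 " -> " $0 } END { print "total", sum }')
echo "$result"
```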


And now for a grand example:

# This awk program collects statistics on two
# "random variables" and the relationships
# between them. It looks only at fields 1 and
# 2 by default. Define the variables F and G
# on the command line to force it to look at
# different fields. For example:
# awk -f stat_2o1.awk F=2 G=3 stuff.dat \
# F=3 G=5 otherstuff.dat
# or, from standard input:
# awk -f stat_2o1.awk F=1 G=3
# It ignores blank lines, lines where either
# one of the requested fields is empty, and
# lines whose first field begins with a number
# sign. It requires only one pass through the
# data. This script works with vanilla awk
# under SunOS 4.1.3.
BEGIN {
    F = 1;
    G = 2;
}
length($F) > 0 && \
length($G) > 0 && \
$1 !~ /^#/ {
    sx1 += $F; sx2 += $F*$F;
    sy1 += $G; sy2 += $G*$G;
    sxy1 += $F*$G;
    if (N == 0) xmax = xmin = $F;
    if (xmin > $F) xmin = $F;
    if (xmax < $F) xmax = $F;
    if (N == 0) ymax = ymin = $G;
    if (ymin > $G) ymin = $G;
    if (ymax < $G) ymax = $G;
    N++;
}

END {
    printf("%d # N\n", N);
    if (N <= 1)
    {
        printf("What's the point?\n");
        exit 1;
    }
    printf("%g # xmin\n", xmin);
    printf("%g # xmax\n", xmax);
    printf("%g # xmean\n", xmean = sx1/N);
    xSigma = sx2 - 2*xmean*sx1 + N*xmean*xmean;
    printf("%g # xvar\n", xvar = xSigma/N);
    printf("%g # xvar unbiased\n", xvaru = xSigma/(N-1));
    printf("%g # xstddev\n", sqrt(xvar));
    printf("%g # xstddev unbiased\n", sqrt(xvaru));

    printf("%g # ymin\n", ymin);
    printf("%g # ymax\n", ymax);
    printf("%g # ymean\n", ymean = sy1/N);
    ySigma = sy2 - 2*ymean*sy1 + N*ymean*ymean;
    printf("%g # yvar\n", yvar = ySigma/N);
    printf("%g # yvar unbiased\n", yvaru = ySigma/(N-1));
    printf("%g # ystddev\n", sqrt(yvar));
    printf("%g # ystddev unbiased\n", sqrt(yvaru));
    if (xSigma * ySigma <= 0)
        r = 0;
    else
        r = (sxy1 - xmean*sy1 - ymean*sx1 + N*xmean*ymean) / sqrt(xSigma * ySigma);
    printf("%g # correlation coefficient\n", r);
    # Braces keep both halves of the message inside the if.
    if (r > 1 || r < -1) {
        printf("SERIOUS ERROR! CORRELATION COEFFICIENT");
        printf(" OUTSIDE RANGE -1..1\n");
    }

    if (1 - r*r != 0) {
        printf("%g # Student's T (use with N-2 degfreed)\n", \
               t = r*sqrt((N-2)/(1 - r*r)));
    } else {
        printf("0 # Correlation is perfect,");
        printf(" Student's T is plus infinity\n");
    }
    # Least-squares fit y = a + b*x.
    b = (sxy1 - ymean*sx1)/(sx2 - xmean*sx1);
    a = ymean - b*xmean;
    # Residual sum of squares, then divided by N-2 degrees of freedom.
    ss = sy2 - 2*a*sy1 - 2*b*sxy1 + N*a*a + 2*a*b*sx1 + b*b*sx2;
    ss /= N-2;
    printf("%g # a = y-intercept\n", a);
    printf("%g # b = slope\n", b);
    printf("%g # s^2 = unbiased estimator for sigsq\n", ss);
    printf("%g + %g * x # equation ready for cut-and-paste\n", a, b);
    ra = sqrt(ss * sx2 / (N * xSigma));
    rb = sqrt(ss / xSigma);
    # Each %g needs its argument in the same printf call.
    printf("%g # radius of confidence interval for a, multiply by t\n", ra);
    printf("%g # radius of confidence interval for b, multiply by t\n", rb);
}
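
For readers who want to check the END block against a textbook, the arithmetic can be restated in standard notation. This is only a rewriting of what the script computes from its running sums; the symbols S_xx and S_yy are introduced here for illustration (the script calls them xSigma and ySigma), and x_i, y_i stand for the values of fields F and G on the i-th accepted line.

```latex
\begin{align*}
\bar{x} &= \frac{1}{N}\sum x_i, \qquad \bar{y} = \frac{1}{N}\sum y_i \\
S_{xx} &= \sum x_i^2 - 2\bar{x}\sum x_i + N\bar{x}^2 = \sum (x_i - \bar{x})^2 \\
S_{yy} &= \sum y_i^2 - 2\bar{y}\sum y_i + N\bar{y}^2 = \sum (y_i - \bar{y})^2 \\
r &= \frac{\sum x_i y_i - \bar{x}\sum y_i - \bar{y}\sum x_i + N\bar{x}\bar{y}}{\sqrt{S_{xx}\,S_{yy}}} \\
t &= r\sqrt{\frac{N-2}{1-r^2}} \\
b &= \frac{\sum x_i y_i - \bar{y}\sum x_i}{\sum x_i^2 - \bar{x}\sum x_i}, \qquad a = \bar{y} - b\,\bar{x} \\
s^2 &= \frac{1}{N-2}\sum (y_i - a - b\,x_i)^2 \\
r_a &= \sqrt{\frac{s^2 \sum x_i^2}{N\,S_{xx}}}, \qquad r_b = \sqrt{\frac{s^2}{S_{xx}}}
\end{align*}
```

The variances printed by the script are S_xx/N and S_xx/(N-1) (and likewise for y). Every quantity above depends only on N and the five sums of x, y, x squared, y squared and xy, which is why a single pass over the data is enough.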