Posted on 2004-4-11 14:37:33
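A gawk implementation
Here we go~~~
Here is the final version of the gawk script.
Besides meeting all of the original poster's requirements, it has the following features:
It copes with an "uneven distribution of records" (a record may end with PARSING_FAILED or with some other string), and it neatly avoids leading and trailing blank lines: blank input lines are dropped and a single blank line is printed only between output records (see the run output for details).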
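[root@home root]# cat myawk6
#!/bin/gawk -f
# Print only the records whose status line is PARSING_FAILED.
{
    # A header line such as "1~906072~458748239(Apr 08 time 2004)" opens a record.
    if ($0 ~ /^1.*\(.*\)$/)
        flag = 1
    # While inside a record, buffer every non-empty line.
    if (flag == 1 && $0 !~ /^$/) {
        line[++i] = $0
    }
    # The end marker means the next line carries the record's status.
    if ($0 == "### END EVENT ###") {
        endflag = 1
        next
    }
    if (endflag == 1) {
        if ($0 == "PARSING_FAILED") {
            cnt++
            # Separate records with a blank line, but print none before the first.
            if (cnt >= 2) print ""
            for (j = 1; j <= i; j++) {
                print line[j]
                delete line[j]
            }
        }
        else {
            # Any other status: discard the buffered record.
            for (j = 1; j <= i; j++) {
                delete line[j]
            }
        }
        # Reset state for the next record.
        i = 0
        flag = 0
        endflag = 0
    }
}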
To check the script's robustness, here is a modified version of the original poster's sample text:
---------------------------------------------------------------
1~906072~458748239(Apr 08 time 2004)
### EVENT ###
kdieiejr
jkjviej
### END EVENT ###
PARSING_FAILED
1~906071~213435245(Apr 08 time 2004)
### EVENT ###
aksldlamalsdf
asdjalskdfjal
### END EVENT ###
PROCESSED
1~906072~458748239(Apr 08 time 2004)
### EVENT ###
kdieiejr
jkjviej
### END EVENT ###
PARSING_FAILED
1~906073~8782293(Apr 08 time 2004)
### EVENT ###
ierueio
dkjfire
### END EVENT ###
PARSING_FAILED
1~906071~213435245(Apr 08 time 2004)
### EVENT ###
aksldlamalsdf
asdjalskdfjal
### END EVENT ###
PROCESSED
----------------------------------------------------------------
The text above shows the kind of "uneven distribution of records" that may occur.
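To see that unevenness at a glance, one could tally the status line that follows each end marker. This is only a minimal sketch and not part of the original script: the function name count_statuses is hypothetical, and it assumes plain awk and that the status always sits on the line right after "### END EVENT ###", as in the sample.

```shell
# Hypothetical helper: count how often each status follows "### END EVENT ###".
count_statuses() {
    awk '
        # The line right after the end marker is taken to be the status line.
        prev == "### END EVENT ###" { tally[$0]++ }
        # Remember the current line for the next iteration.
        { prev = $0 }
        END { for (s in tally) print s, tally[s] }
    ' "$1" | LC_ALL=C sort
}
```

Called as `count_statuses tmpdata` on the sample text above, it should report PARSING_FAILED 3 and PROCESSED 2.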
Now look at the run output:
-------------------------------------------------------------------
[root@home root]# gawk -f myawk6 tmpdata
1~906072~458748239(Apr 08 time 2004)
### EVENT ###
kdieiejr
jkjviej
### END EVENT ###
PARSING_FAILED

1~906072~458748239(Apr 08 time 2004)
### EVENT ###
kdieiejr
jkjviej
### END EVENT ###
PARSING_FAILED

1~906073~8782293(Apr 08 time 2004)
### EVENT ###
ierueio
dkjfire
### END EVENT ###
PARSING_FAILED
-------------------------------------------------------------------
awk resembles C and has many features of a high-level language; in this script I used the "flag" variable technique that is common in C.
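For comparison, the same flag idea can be written with awk's pattern-action rules instead of nested if statements. The following is a sketch under assumptions, not the script from this post: the function name filter_failed and the variables collecting, buf, prev and printed are hypothetical, and it buffers the record in one string rather than in an array.

```shell
# Hypothetical variant: keep only records whose status is PARSING_FAILED.
filter_failed() {
    awk '
        # A header line opens a record: raise the flag and start a fresh buffer.
        /^1.*\(.*\)$/ { collecting = 1; buf = "" }
        # While the flag is up, append every non-empty line to the buffer.
        collecting && $0 != "" { buf = buf $0 "\n" }
        # The line right after the end marker is the status line; it closes the record.
        prev == "### END EVENT ###" {
            if ($0 == "PARSING_FAILED") {
                if (printed++) print ""    # blank line between records, none before the first
                printf "%s", buf
            }
            collecting = 0; buf = ""
        }
        # Remember the current line so the next one can detect the end marker.
        { prev = $0 }
    ' "$1"
}
```

It would be called as `filter_failed tmpdata`. Tracking the previous line in prev plays the role of endflag, and the single string buffer removes the need for the delete loops.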