|
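Open the source of the web page you want to download from, use grep to pull out all the links, then tidy them up in an editor such as gvim and save them as a file named address, with contents like: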
/iel5/9112/28901/01300945.pdf?tp=&arnumber=1300945&isnumber=28901
/iel5/8570/27140/01205864.pdf?tp=&arnumber=1205864&isnumber=27140
/iel5/8570/27140/01205780.pdf?tp=&arnumber=1205780&isnumber=27140
/iel5/8535/26975/01199484.pdf?tp=&arnumber=1199484&isnumber=26975
/iel5/8091/22457/01048150.pdf?tp=&arnumber=1048150&isnumber=22457
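The grep step above could be sketched roughly as follows. This is only an illustration under a few assumptions: the page source has been saved as page.html (a hypothetical file name), the links sit inside double-quoted attributes, and "&" may be HTML-escaped as "&amp;" in the source. The helper name extract_links is also made up for this example; adjust the pattern to the actual page.

```shell
# extract_links FILE: print each unique /iel5/...pdf link found in FILE.
# Assumption: links appear inside double-quoted attributes, so a link ends
# at the next double quote. "&amp;" is turned back into "&".
extract_links() {
    grep -o '/iel5/[^"]*\.pdf[^"]*' "$1" |
        sed 's/&amp;/\&/g' |
        sort -u
}

# page.html is a hypothetical name for the saved page source.
if [ -f page.html ]; then
    extract_links page.html > address
fi
```

The resulting address file can still be checked and edited by hand before the download script is run.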
Then run the script bot:
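domain=http://ieeexplore.ieee.org
# Each line of address is one link path; the links contain no whitespace.
for i in $(cat address)
do
    # Local file name: everything after the last "/" in the link.
    j=${i##*/}
    # Skip files that have already been downloaded.
    if [ ! -f "$j" ]
    then
        proz -r -1 "$domain$i"
        sleep 15    # pause between downloads
    fi
done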
That is all it takes.
The downloaded files have names like:
01300945.pdf?tp=&arnumber=1300945&isnumber=28901 |
|