view workers/bin/_timedWhich.sh @ 43:c2b72d29a3ee
update to use _timedWhich.py
author | Henry S. Thompson <ht@markup.co.uk> |
---|---|
date | Fri, 30 Nov 2018 18:37:40 +0000 |
parents | d4f186655bcc |
children |
#!/bin/bash
egrep -o '("WARC-Target-URI":"https?:|"Last-Modified":"[^"]*")'|\
egrep -o '(https?:|:".*"$)' |\
tr '\012' \# | sed 's/:#:/ /g'|tr \# '\012' | tr -d \"|\
sed ';s/gmt//ig;s/ [[:digit:]][[:digit:]]\?:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]\(\.[[:digit:]]*\)\?\b//;s/^\(https\? \)\(: \)/\1/;s/ [MTWFSa-z]..\.\?, \?/ /;s/\( [[:upper:]][[:alnum:]]\{1,3\}\)\{1,2\}$//;s/ [-+][[:digit:]]\{4\}\b//;s/ [[:digit:]]\{1,2\} / /;s/ [[:upper:]][[:alnum:]]*\/[[:upper:]][[:alnum:]]*$//;s/ \+$//'|\
awk '{c[$0]+=1} END {for (k in c) {print k, c[k]}}'
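
The pipeline above can be hard to follow without sample input, so here is a minimal sketch of how it might be exercised. It is not part of the repository. The function name `tally_last_modified` and the three sample records are made up for illustration, and the record layout (one JSON-like record per line, with `"WARC-Target-URI"` appearing before `"Last-Modified"`) is an assumption about the input. `grep -E` is substituted for `egrep`, which recent GNU grep releases flag as obsolescent; the `sed` expressions are copied unchanged and rely on GNU sed extensions such as `\?`, `\+`, `\b` and the `i` flag.

```shell
#!/bin/bash
# Hypothetical wrapper around the same pipeline stages as the script above.
# Assumes GNU grep, sed and awk; `grep -E` stands in for `egrep`.
tally_last_modified() {
  # 1. Keep only the URI scheme and the Last-Modified value of each record.
  grep -E -o '("WARC-Target-URI":"https?:|"Last-Modified":"[^"]*")' |
  grep -E -o '(https?:|:".*"$)' |
  # 2. Join each scheme line with the date line that follows it, drop quotes.
  tr '\012' \# | sed 's/:#:/ /g' | tr \# '\012' | tr -d \" |
  # 3. Strip timezone, time of day, weekday and day of month from the date.
  sed ';s/gmt//ig;s/ [[:digit:]][[:digit:]]\?:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]\(\.[[:digit:]]*\)\?\b//;s/^\(https\? \)\(: \)/\1/;s/ [MTWFSa-z]..\.\?, \?/ /;s/\( [[:upper:]][[:alnum:]]\{1,3\}\)\{1,2\}$//;s/ [-+][[:digit:]]\{4\}\b//;s/ [[:digit:]]\{1,2\} / /;s/ [[:upper:]][[:alnum:]]*\/[[:upper:]][[:alnum:]]*$//;s/ \+$//' |
  # 4. Count identical "scheme month year" lines.
  awk '{c[$0]+=1} END {for (k in c) {print k, c[k]}}'
}

# Made-up sample records (assumed input layout, not real crawl data).
sample='{"WARC-Target-URI":"https://example.org/a","Last-Modified":"Thu, 01 Jan 2015 12:34:56 GMT"}
{"WARC-Target-URI":"http://example.com/b","Last-Modified":"Fri, 02 Jan 2015 01:02:03 GMT"}
{"WARC-Target-URI":"https://example.org/c","Last-Modified":"Sat, 03 Jan 2015 23:59:59 GMT"}'

# awk's for-in order is unspecified, so sort to get a stable listing.
result=$(printf '%s\n' "$sample" | tally_last_modified | LC_ALL=C sort)
printf '%s\n' "$result"
```

With GNU tools this should print `http Jan 2015 1` followed by `https Jan 2015 2`: one line per scheme, month and year combination, with its count in the last column. The final `sort` is only there to make the listing deterministic; the original script leaves the order to `awk`.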