comparison workers/bin/_timedWhich.sh @ 19:d4f186655bcc
lots of tweaking, reached the 80/20 point
author | Henry S. Thompson <ht@markup.co.uk> |
---|---|
date | Sat, 20 Oct 2018 16:11:29 +0000 |
parents | 9631fca89cc6 |
children |
18:9631fca89cc6 | 19:d4f186655bcc |
---|---|
1 #!/bin/bash | 1 #!/bin/bash |
2 egrep -o '("WARC-Target-URI":"https?:|"Last-Modified":"[^"]*")'|\ | 2 egrep -o '("WARC-Target-URI":"https?:|"Last-Modified":"[^"]*")'|\ |
3 egrep -o '(https?:|:".*"$)' |\ | 3 egrep -o '(https?:|:".*"$)' |\ |
4 tr '\012' \# | sed 's/:#:/ /g'|tr \# '\012' | tr -d \"|\ | 4 tr '\012' \# | sed 's/:#:/ /g'|tr \# '\012' | tr -d \"|\ |
5 sed 's/ [[:digit:]][[:digit:]]\?:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]] / /;s/\(https\? \)\(: \)\?[MTWFSa-z]..\.\?, \?/\1/;s/ \([-+][[:digit:]]\{4\}\|[[:upper:]]\{2,3\}\)$//;s/ [[:digit:]]\{1,2\} / /;s/\/[[:digit:]]\{1,2\}\/\([[:digit:]]\{4\}\)$/ \1/'|\ | 5 sed ';s/gmt//ig;s/ [[:digit:]][[:digit:]]\?:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]\(\.[[:digit:]]*\)\?\b//;s/^\(https\? \)\(: \)/\1/;s/ [MTWFSa-z]..\.\?, \?/ /;s/\( [[:upper:]][[:alnum:]]\{1,3\}\)\{1,2\}$//;s/ [-+][[:digit:]]\{4\}\b//;s/ [[:digit:]]\{1,2\} / /;s/ [[:upper:]][[:alnum:]]*\/[[:upper:]][[:alnum:]]*$//;s/ \+$//'|\ |
6 awk '{c[$0]+=1} END {for (k in c) {print k, c[k]}}' | 6 awk '{c[$0]+=1} END {for (k in c) {print k, c[k]}}' |
7 | 7 |
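
To make the effect of the revision 19 pipeline easier to see, the following sketch wraps it in a shell function and runs it on three made-up records. The function name `tally_scheme_month`, the `sample` records, the use of `grep -E -o` in place of `egrep -o`, and the final `sort` are assumptions added for illustration. The `sed` and `awk` stages are copied from the right-hand column above, minus the leading `;`, which should be a no-op in GNU sed. GNU grep and sed are assumed, since the script relies on GNU extensions such as `\?`, `\b` and the `i` flag.

```shell
#!/bin/bash
# Illustrative wrapper around the revision 19 pipeline (the function name is hypothetical).
tally_scheme_month() {
  # 1. Pull the URI scheme and the Last-Modified value out of each record.
  grep -E -o '("WARC-Target-URI":"https?:|"Last-Modified":"[^"]*")' |
  grep -E -o '(https?:|:".*"$)' |
  # 2. Join each scheme with the date that follows it, then drop the quotes.
  tr '\012' '#' | sed 's/:#:/ /g' | tr '#' '\012' | tr -d '"' |
  # 3. Strip "GMT", time of day, weekday, zone names, numeric offsets and day of month.
  sed 's/gmt//ig;s/ [[:digit:]][[:digit:]]\?:[[:digit:]][[:digit:]]:[[:digit:]][[:digit:]]\(\.[[:digit:]]*\)\?\b//;s/^\(https\? \)\(: \)/\1/;s/ [MTWFSa-z]..\.\?, \?/ /;s/\( [[:upper:]][[:alnum:]]\{1,3\}\)\{1,2\}$//;s/ [-+][[:digit:]]\{4\}\b//;s/ [[:digit:]]\{1,2\} / /;s/ [[:upper:]][[:alnum:]]*\/[[:upper:]][[:alnum:]]*$//;s/ \+$//' |
  # 4. Count identical lines; the sort is added here only to make the order stable.
  awk '{c[$0]+=1} END {for (k in c) {print k, c[k]}}' | LC_ALL=C sort
}

# Made-up sample records, one per line (not taken from any real crawl data).
sample='{"WARC-Target-URI":"http://example.com/a","Last-Modified":"Sat, 20 Oct 2018 16:11:29 GMT"}
{"WARC-Target-URI":"http://example.org/b","Last-Modified":"Mon, 01 Oct 2018 09:00:00 GMT"}
{"WARC-Target-URI":"https://example.net/c","Last-Modified":"Tue, 05 Sep 2017 23:59:59 +0000"}'

printf '%s\n' "$sample" | tally_scheme_month
# With GNU tools this should print:
#   http Oct 2018 2
#   https Sep 2017 1
```

Each output line is a scheme, whatever remains of the date after normalisation (here month and year), and the number of records that reduced to that same key. Records whose dates use formats the `sed` expressions do not anticipate will end up under their own, less tidy keys, which is consistent with the commit message's "80/20" remark.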