annotate workers/bin/_timedWhich.py @ 40:4cf6bc21f683

start work on python version of tW.sh
author Henry S. Thompson <ht@markup.co.uk>
date Fri, 30 Nov 2018 13:43:36 +0000
parents
children 1d776e96c16a
#!/usr/bin/env python3
import re,sys,io

# Read stdin as latin1 so that every input byte decodes without error
uin=io.TextIOWrapper(sys.stdin.buffer,encoding='latin1')
# Response records; group 1 is the URI scheme (http or https)
p1=re.compile('"WARC-Target-URI":"(https?):.*msgtype=response')
p2=re.compile('"Last-Modified":"([^"]*)"')
w={}   # per-scheme count of responses with a Last-Modified header
wo={}  # per-scheme count of responses without one
for l in uin:
    m=p1.search(l)
    if m:
        k=m.group(1)
        # Look for Last-Modified only after the end of the p1 match
        m=p2.search(l,m.end())
        if m is None:
            wo[k]=wo.get(k,0)+1
        else:
            w[k]=w.get(k,0)+1
print("with %s\nw/o %s"%(w,wo))
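
The counting logic above can also be wrapped in a function so that it can be exercised without piping data through stdin. The following is only a sketch: the function name count_by_scheme and the sample lines are hypothetical, and the layout of the sample lines is an assumption, since the real input format is not shown in this changeset. The two regular expressions are copied from the script unchanged.

```python
import re
from collections import Counter

# Same patterns as in the script above
P1 = re.compile('"WARC-Target-URI":"(https?):.*msgtype=response')
P2 = re.compile('"Last-Modified":"([^"]*)"')

def count_by_scheme(lines):
    """Return (with_lm, without_lm): per-scheme counts of response lines
    with and without a Last-Modified header after the p1 match."""
    with_lm, without_lm = Counter(), Counter()
    for line in lines:
        m = P1.search(line)
        if not m:
            continue  # not a response record
        scheme = m.group(1)
        # As in the script, search only from the end of the first match
        if P2.search(line, m.end()):
            with_lm[scheme] += 1
        else:
            without_lm[scheme] += 1
    return with_lm, without_lm

# Made-up sample lines; the real input layout is an assumption
sample = [
    '{"WARC-Target-URI":"http://example.org/a"} msgtype=response '
    '{"Last-Modified":"Fri, 30 Nov 2018 13:43:36 GMT"}',
    '{"WARC-Target-URI":"https://example.org/b"} msgtype=response {}',
    '{"WARC-Target-URI":"https://example.org/c"} msgtype=request '
    '{"Last-Modified":"Fri, 30 Nov 2018 13:43:36 GMT"}',
]
with_lm, without_lm = count_by_scheme(sample)
print("with %s\nw/o %s" % (dict(with_lm), dict(without_lm)))
```

Counter is used here in place of the dict.get(k,0)+1 idiom; the resulting counts are the same. The third sample line is skipped because it is not a response record.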