Mercurial > hg > cc > cirrus_home
graph
- support field edit (2021-06-28, by Henry S. Thompson)
- for use in processing CC index files (2021-06-28, by Henry S. Thompson)
- implement --cmd (2021-06-16, by Henry S. Thompson)
- qpdf needs LD_LIB_PATH (2021-06-16, by Henry S. Thompson)
- refactor final processing loop, (2021-06-15, by Henry S. Thompson)
- frame size (2021-06-15, by Henry S. Thompson)
- include sh-script (2021-06-15, by Henry S. Thompson)
- all parts working, idempotency achieved (2021-04-26, by Henry S. Thompson)
- debugging (2021-04-26, by Henry S. Thompson)
- (none) (2021-04-26, by Henry S. Thompson)
- warc and headers parts working (2021-04-26, by Henry S. Thompson)
- back to IGzipFile (2021-04-22, by Henry S. Thompson)
- approved Popen version using .communicate (2021-04-22, by Henry S. Thompson)
- using Popen to run igzip (also not great) (2021-04-22, by Henry S. Thompson)
- added support for copying to/using /dev/shm or /tmp (2021-04-20, by Henry S. Thompson)
- working with -x and rich directory structure (2021-04-20, by Henry S. Thompson)
- convert to rich directory structure per 2019-35 (2021-04-20, by Henry S. Thompson)
- -x barely working (2021-04-19, by Henry S. Thompson)
- never should have added (2021-04-19, by Henry S. Thompson)
- better dd error handling (2021-04-19, by Henry S. Thompson)
- (none) (2021-04-19, by Henry S. Thompson)
- bare minimum working (2021-04-18, by Henry S. Thompson)
- triple args checked, filename opened (2021-04-16, by Henry S. Thompson)
- help format hacking done (2021-04-16, by Henry S. Thompson)
- basic help format hacking works (2021-04-16, by Henry S. Thompson)
- (none) (2021-04-16, by Henry S. Thompson)
- (none) (2021-04-16, by Henry S. Thompson)
- just strugging with argparse (2021-04-15, by Henry S. Thompson)
- support a command to receive each result, (2021-04-15, by Henry S. Thompson)
- accepts index lines, less line-at-a-time (2021-04-14, by Henry S. Thompson)
- working with one input (2021-04-14, by Henry S. Thompson)
- -w and -h working (2021-04-14, by Henry S. Thompson)
- working on flags (2021-04-13, by Henry S. Thompson)
- new (2021-04-13, by Henry S. Thompson)
- working with locking and copying (2021-03-16, by Henry S. Thompson)
- working for -t 2 -c 2 (2021-03-15, by Henry S. Thompson)
- minor (2021-03-15, by Henry S. Thompson)
- prepare for real parallel distribution (2021-03-14, by Henry S. Thompson)
- environment improvements (2021-03-14, by Henry S. Thompson)
- trying to move to slurm (2021-03-03, by Henry S. Thompson)
- improved F handling/logging (2020-05-09, by Henry S. Thompson)
- keep separate antecedants separate, buggy? (2020-05-08, by Henry S. Thompson)
- track redirects, need to us full crawldiagnostics.warc.gz for "location:" and "Uri:" (2020-05-07, by Henry S. Thompson)
- refactor, change summary print (problem?) (2020-05-07, by Henry S. Thompson)
- bare framework working (2020-05-06, by Henry S. Thompson)
- starting on tool to assemble as complete as we have info wrt a seed URI (2020-05-06, by Henry S. Thompson)
- use local .m2/repository for Hadoop 3.4.0 (2020-05-06, by Henry S. Thompson)
- works for big files with Hadoop 3.4.0 (2020-05-06, by Henry S. Thompson)
- x (2020-05-06, by Henry S. Thompson)
- log trucations (2020-04-28, by Henry S. Thompson)
- impose some limits (2020-04-28, by Henry S. Thompson)
- x (2020-04-28, by Henry S. Thompson)
- x (2020-04-24, by Henry S. Thompson)
- mostly from Sebastian (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- fix from Sebastian (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- several efficiency (hofentlich) tweaks (2020-04-24, by Henry S. Thompson)
- x (2020-04-23, by Henry S. Thompson)
- switch for use on login server, invoke by hand with 0/1 as only cmd line arg (2020-04-23, by Henry S. Thompson)
- java stuff (2020-04-22, by Henry S. Thompson)
- try nutch fetch for big pdfs (2020-04-22, by Henry S. Thompson)
- final most general versin (2020-04-15, by Henry S. Thompson)
- too big for /dev/shm, split in half (2020-04-14, by Henry S. Thompson)
- one-off to convert big extracts.tar into lots of smaller ones (2020-04-14, by Henry S. Thompson)
- as used successfully for 3rd run (2020-04-13, by Henry S. Thompson)
- ready to try another pass with robust diff checking (2020-04-13, by Henry S. Thompson)
- working towards more robust diff checking (2020-04-13, by Henry S. Thompson)
- a few tweaks after 2nd parallel run (2020-04-11, by Henry S. Thompson)
- another few log fixes (2020-04-10, by Henry S. Thompson)
- as running, modulo 1 log output wrong (2020-04-10, by Henry S. Thompson)
- log more, work around more glitches (2020-04-10, by Henry S. Thompson)
- x (2020-04-10, by Henry S. Thompson)
- start try to work around failures (2020-04-08, by Henry S. Thompson)
- parallelised version of reExtract.sh (2020-04-08, by Henry S. Thompson)
- added computation of required additions to tar file, but not actually added (2020-04-04, by Henry S. Thompson)
- refactored, not tested (2020-04-03, by Henry S. Thompson)
- done through re-extraction, fixing tars still to come (2020-04-03, by Henry S. Thompson)
- sketching more (2020-04-02, by Henry S. Thompson)
- towards re-running extraction in part (2020-04-02, by Henry S. Thompson)
- up the time limit (2020-04-02, by Henry S. Thompson)
- clean up after ourselves (2020-04-02, by Henry S. Thompson)
- fixed scope pblm in tar step (2020-03-26, by Henry S. Thompson)
- sync up filenames and log names, (2020-03-26, by Henry S. Thompson)
- pass through extract args (2020-03-26, by Henry S. Thompson)
- towards sub-division of resulting tar files (2020-03-24, by Henry S. Thompson)
- not relevant (2020-03-24, by Henry S. Thompson)
- x (2020-03-19, by Henry S. Thompson)
- better quoting (2020-03-19, by Henry S. Thompson)
- try to fix multi-line lossage (2020-03-18, by Henry S. Thompson)
- fix missing use of $t (2020-03-18, by Henry S. Thompson)
- first cut at doing extraction here (2020-03-18, by Henry S. Thompson)
- finally hacked something that works (2020-03-18, by Henry S. Thompson)