Mercurial > hg > cc > cirrus_home
- add target test-core which (dangerously) avoids (we hope pointless) recompilation of all the plugins [default tip] (5 months ago, by Henry S. Thompson)
- move DummyContext out (5 months ago, by Henry S. Thompson)
- works, although output not checked (5 months ago, by Henry S. Thompson)
- maybe triggers jdb on tests with -DdebugTest=true on command line (5 months ago, by Henry S. Thompson)
- orig, more or less (5 months ago, by Henry S. Thompson)
- working, with issues: (5 months ago, by Henry S. Thompson)
- compiles with content, but fails with EOF -- need blank lines? (5 months ago, by Henry S. Thompson)
- runs, but no cdx yet, because no value.content I presume (5 months ago, by Henry S. Thompson)
- add lastmod to cdx lines, (5 months ago, by Henry S. Thompson)
- csing-related tweaks (12 months ago, by Henry S. Thompson)
- too many overdue updates to break down (15 months ago, by Henry S. Thompson)
- use csing, and _runme_c.sh to get it initialised (18 months ago, by Henry S. Thompson)
- MANPATH (?) (18 months ago, by Henry S. Thompson)
- tab completion fix (18 months ago, by Henry S. Thompson)
- add support for multiple calls to srun with a counter (19 months ago, by Henry S. Thompson)
- add private work bin dir to PATH (20 months ago, by Henry S. Thompson)
- tweak UI: copy/paste and title bar (20 months ago, by Henry S. Thompson)
- ec184 now, run w. unbuffered output (20 months ago, by Henry S. Thompson)
- moved to work tree (20 months ago, by Henry S. Thompson)
- working, about to move to work tree (20 months ago, by Henry S. Thompson)
- working on implementing types and parts: (20 months ago, by Henry S. Thompson)
- change account back (2023-01-10, by Henry S. Thompson)
- x (2022-07-28, by Henry S. Thompson)
- generalised sbatch front-end to cdx2tsv.py (2022-07-28, by Henry S. Thompson)
- x (2022-07-28, by Henry S. Thompson)
- add $W (2022-07-20, by Henry S. Thompson)
- new-style log notice (2022-07-20, by Henry S. Thompson)
- x (2022-07-20, by Henry S. Thompson)
- new style batch jobs, see cirrus_work repo for _xxx.sh (2022-07-18, by Henry S. Thompson)
- old style (2022-07-18, by Henry S. Thompson)
- symlink to dir doesn't work (2022-07-18, by Henry S. Thompson)
- work-path bin dir (2022-07-18, by Henry S. Thompson)
- previous approach to lang/field extraction (2022-07-18, by Henry S. Thompson)
- moved to shared/bin (2022-07-18, by Henry S. Thompson)
- x (2022-07-18, by Henry S. Thompson)
- x (2022-07-18, by Henry S. Thompson)
- demo of slurm usage using cdx2tsv.py (2022-07-06, by Henry S. Thompson)
- do whole line (2022-07-06, by Henry S. Thompson)
- no more gentoo, (2022-07-04, by Henry S. Thompson)
- allow use of global stash (2022-07-04, by Henry S. Thompson)
- for 2022 exercise (2022-07-01, by Henry Thompson)
- instead of csv (2021-11-17, by Henry S. Thompson)
- add -c switch to btot (2021-11-01, by Henry S. Thompson)
- use sqlite3 just to tabulate (2021-10-28, by Henry S. Thompson)
- fixed (2021-10-26, by Henry S. Thompson)
- working, with compound driver files (2021-10-26, by Henry S. Thompson)
- better comments (2021-10-25, by Henry S. Thompson)
- do the work for cdx2sql (2021-10-25, by Henry S. Thompson)
- change test to use Master (2021-10-25, by Henry S. Thompson)
- works for 0--9 (2021-10-22, by Henry S. Thompson)
- replace too-complex invocation of cdx2tsv (2021-10-21, by Henry S. Thompson)
- basic, works (2021-10-20, by Henry S. Thompson)
- too clever by half, keys won't work in parallel for e.g. media types (2021-10-20, by Henry S. Thompson)
- working, w. pickle (2021-10-19, by Henry S. Thompson)
- mail-lib (2021-10-19, by Henry S. Thompson)
- move to ec164.guest (2021-10-19, by Henry S. Thompson)
- fixed bug(s) wrt large payload files (2021-07-23, by Henry S. Thompson)
- just barely working (2021-07-23, by Henry S. Thompson)
- add cl arg --fpath replacing FPAT, which is now default value (2021-07-21, by Henry S. Thompson)
- more paths (2021-07-21, by Henry S. Thompson)
- add usage/help info (2021-07-14, by Henry S. Thompson)
- add usage/help info (2021-07-14, by Henry S. Thompson)
- parameterise the temp file and move it to /dev/shm (2021-07-14, by Henry S. Thompson)
- sic (2021-07-14, by Henry S. Thompson)
- use printf safely (2021-07-09, by Henry S. Thompson)
- handle multiple L-M lines :-( (2021-07-09, by Henry S. Thompson)
- improve error handling (2021-07-09, by Henry S. Thompson)
- more focussed, better SLURM_... vars (2021-07-09, by Henry S. Thompson)
- bits and pieces (2021-06-29, by Henry S. Thompson)
- better btot (2021-06-29, by Henry S. Thompson)
- extract Last Modified via cdx (2021-06-28, by Henry S. Thompson)
- fix path to qpdf (2021-06-28, by Henry S. Thompson)
- silently skip robotstxt (2021-06-28, by Henry S. Thompson)
- workaround histcontrol (2021-06-28, by Henry S. Thompson)
- support field edit (2021-06-28, by Henry S. Thompson)
- for use in processing CC index files (2021-06-28, by Henry S. Thompson)
- implement --cmd (2021-06-16, by Henry S. Thompson)
- qpdf needs LD_LIB_PATH (2021-06-16, by Henry S. Thompson)
- refactor final processing loop, (2021-06-15, by Henry S. Thompson)
- frame size (2021-06-15, by Henry S. Thompson)
- include sh-script (2021-06-15, by Henry S. Thompson)
- all parts working, idempotency achieved (2021-04-26, by Henry S. Thompson)
- debugging (2021-04-26, by Henry S. Thompson)
- (none) (2021-04-26, by Henry S. Thompson)
- warc and headers parts working (2021-04-26, by Henry S. Thompson)
- back to IGzipFile (2021-04-22, by Henry S. Thompson)
- approved Popen version using .communicate (2021-04-22, by Henry S. Thompson)
- using Popen to run igzip (also not great) (2021-04-22, by Henry S. Thompson)
- added support for copying to/using /dev/shm or /tmp (2021-04-20, by Henry S. Thompson)
- working with -x and rich directory structure (2021-04-20, by Henry S. Thompson)
- convert to rich directory structure per 2019-35 (2021-04-20, by Henry S. Thompson)
- -x barely working (2021-04-19, by Henry S. Thompson)
- never should have added (2021-04-19, by Henry S. Thompson)
- better dd error handling (2021-04-19, by Henry S. Thompson)
- (none) (2021-04-19, by Henry S. Thompson)
- bare minimum working (2021-04-18, by Henry S. Thompson)
- triple args checked, filename opened (2021-04-16, by Henry S. Thompson)
- help format hacking done (2021-04-16, by Henry S. Thompson)
- basic help format hacking works (2021-04-16, by Henry S. Thompson)
- (none) (2021-04-16, by Henry S. Thompson)
- (none) (2021-04-16, by Henry S. Thompson)
- just struggling with argparse (2021-04-15, by Henry S. Thompson)
- support a command to receive each result, (2021-04-15, by Henry S. Thompson)
- accepts index lines, less line-at-a-time (2021-04-14, by Henry S. Thompson)
- working with one input (2021-04-14, by Henry S. Thompson)
- -w and -h working (2021-04-14, by Henry S. Thompson)
- working on flags (2021-04-13, by Henry S. Thompson)
- new (2021-04-13, by Henry S. Thompson)
- working with locking and copying (2021-03-16, by Henry S. Thompson)
- working for -t 2 -c 2 (2021-03-15, by Henry S. Thompson)
- minor (2021-03-15, by Henry S. Thompson)
- prepare for real parallel distribution (2021-03-14, by Henry S. Thompson)
- environment improvements (2021-03-14, by Henry S. Thompson)
- trying to move to slurm (2021-03-03, by Henry S. Thompson)
- improved F handling/logging (2020-05-09, by Henry S. Thompson)
- keep separate antecedents separate, buggy? (2020-05-08, by Henry S. Thompson)
- track redirects, need to use full crawldiagnostics.warc.gz for "location:" and "Uri:" (2020-05-07, by Henry S. Thompson)
- refactor, change summary print (problem?) (2020-05-07, by Henry S. Thompson)
- bare framework working (2020-05-06, by Henry S. Thompson)
- starting on tool to assemble as complete as we have info wrt a seed URI (2020-05-06, by Henry S. Thompson)
- use local .m2/repository for Hadoop 3.4.0 (2020-05-06, by Henry S. Thompson)
- works for big files with Hadoop 3.4.0 (2020-05-06, by Henry S. Thompson)
- x (2020-05-06, by Henry S. Thompson)
- log truncations (2020-04-28, by Henry S. Thompson)
- impose some limits (2020-04-28, by Henry S. Thompson)
- x (2020-04-28, by Henry S. Thompson)
- x (2020-04-24, by Henry S. Thompson)
- mostly from Sebastian (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- fix from Sebastian (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- misc (2020-04-24, by Henry S. Thompson)
- several efficiency (hopefully) tweaks (2020-04-24, by Henry S. Thompson)
- x (2020-04-23, by Henry S. Thompson)
- switch for use on login server, invoke by hand with 0/1 as only cmd line arg (2020-04-23, by Henry S. Thompson)
- java stuff (2020-04-22, by Henry S. Thompson)
- try nutch fetch for big pdfs (2020-04-22, by Henry S. Thompson)
- final most general version (2020-04-15, by Henry S. Thompson)
- too big for /dev/shm, split in half (2020-04-14, by Henry S. Thompson)
- one-off to convert big extracts.tar into lots of smaller ones (2020-04-14, by Henry S. Thompson)
- as used successfully for 3rd run (2020-04-13, by Henry S. Thompson)
- ready to try another pass with robust diff checking (2020-04-13, by Henry S. Thompson)
- working towards more robust diff checking (2020-04-13, by Henry S. Thompson)
- a few tweaks after 2nd parallel run (2020-04-11, by Henry S. Thompson)
- another few log fixes (2020-04-10, by Henry S. Thompson)
- as running, modulo 1 log output wrong (2020-04-10, by Henry S. Thompson)
- log more, work around more glitches (2020-04-10, by Henry S. Thompson)
- x (2020-04-10, by Henry S. Thompson)
- start try to work around failures (2020-04-08, by Henry S. Thompson)
- parallelised version of reExtract.sh (2020-04-08, by Henry S. Thompson)
- added computation of required additions to tar file, but not actually added (2020-04-04, by Henry S. Thompson)
- refactored, not tested (2020-04-03, by Henry S. Thompson)
- done through re-extraction, fixing tars still to come (2020-04-03, by Henry S. Thompson)
- sketching more (2020-04-02, by Henry S. Thompson)
- towards re-running extraction in part (2020-04-02, by Henry S. Thompson)
- up the time limit (2020-04-02, by Henry S. Thompson)
- clean up after ourselves (2020-04-02, by Henry S. Thompson)
- fixed scope pblm in tar step (2020-03-26, by Henry S. Thompson)
- sync up filenames and log names, (2020-03-26, by Henry S. Thompson)
- pass through extract args (2020-03-26, by Henry S. Thompson)
- towards sub-division of resulting tar files (2020-03-24, by Henry S. Thompson)
- not relevant (2020-03-24, by Henry S. Thompson)
- x (2020-03-19, by Henry S. Thompson)
- better quoting (2020-03-19, by Henry S. Thompson)
- try to fix multi-line lossage (2020-03-18, by Henry S. Thompson)
- fix missing use of $t (2020-03-18, by Henry S. Thompson)
- first cut at doing extraction here (2020-03-18, by Henry S. Thompson)
- finally hacked something that works (2020-03-18, by Henry S. Thompson)
- (none) (2020-03-18, by Henry S. Thompson)
- (none) (2020-03-18, by Henry S. Thompson)
- x (2020-03-18, by Henry S. Thompson)
- more job scripts (2020-03-18, by Henry S. Thompson)
- more job scripts (2020-03-18, by Henry S. Thompson)
- local setup (2020-03-18, by Henry S. Thompson)
- copied from valhalla/bin (2020-03-16, by Henry S. Thompson)
- fix a mis-folded link file (2020-02-27, by Henry S. Thompson)
- sic (2020-02-27, by Henry S. Thompson)
- use awk to do a join between links and 1132dates (2020-02-26, by Henry S. Thompson)
- works after minor tweaks (2020-02-26, by Henry S. Thompson)
- modelled on plinks (2020-02-26, by Henry S. Thompson)
- fixes to pdfx to timeout, use regex (2020-02-26, by Henry S. Thompson)
- add args for start tar and number of tars (2020-02-25, by Henry S. Thompson)
- give up on mpiexec_mpt (2020-02-25, by Henry S. Thompson)
- bigger run, longer limit (2020-02-25, by Henry S. Thompson)
- logging tweaks, preparing for timeout on problem pdfs (2020-02-25, by Henry S. Thompson)
- longer run, terser logging (2020-02-24, by Henry S. Thompson)
- more logging (2020-02-24, by Henry S. Thompson)
- refactor to address tarred-up pdfs (2020-02-23, by Henry S. Thompson)
- merge (2020-02-19, by Henry S. Thompson)
- try harder not to write empty links files (2020-02-19, by Henry S. Thompson)
- only create links file if there are some (2020-02-18, by Henry Thompson)
- typos (2020-02-18, by Henry Thompson)
- switch to file loop inside python, assume file index integer in pipe as well as filename, check /dev/shm/stopJob (2020-02-18, by Henry Thompson)
- bolting the barn door... (2020-02-18, by Henry S. Thompson)