log

age author description
Thu, 26 Sep 2024 17:55:56 +0100 Henry S. Thompson add target test-core which (dangerously) avoids (we hope pointless) recompilation of all the plugins [default tip]
Wed, 25 Sep 2024 17:45:52 +0100 Henry S. Thompson move DummyContext out
Wed, 25 Sep 2024 13:52:42 +0100 Henry S. Thompson works, although output not checked
Wed, 25 Sep 2024 13:51:15 +0100 Henry S. Thompson maybe triggers jdb on tests with -DdebugTest=true on command line
Wed, 25 Sep 2024 09:49:12 +0100 Henry S. Thompson orig, more or less
Tue, 24 Sep 2024 17:08:05 +0100 Henry S. Thompson working, with issues:
Tue, 24 Sep 2024 12:34:51 +0100 Henry S. Thompson compiles with content, but fails with EOF -- need blank lines?
Mon, 23 Sep 2024 19:18:36 +0100 Henry S. Thompson runs, but no cdx yet, because no value.content I presume
Mon, 23 Sep 2024 16:35:22 +0100 Henry S. Thompson add lastmod to cdx lines,
Thu, 15 Feb 2024 22:31:43 +0000 Henry S. Thompson csing-related tweaks
Wed, 06 Dec 2023 13:38:58 +0000 Henry S. Thompson too many overdue updates to break down
Fri, 08 Sep 2023 21:44:48 +0100 Henry S. Thompson use csing, and _runme_c.sh to get it initialised
Fri, 08 Sep 2023 21:42:55 +0100 Henry S. Thompson MANPATH (?)
Fri, 08 Sep 2023 21:42:12 +0100 Henry S. Thompson tab completion fix
Fri, 21 Jul 2023 11:38:20 +0100 Henry S. Thompson add support for multiple calls to srun with a counter
Wed, 05 Jul 2023 15:08:59 +0100 Henry S. Thompson add private work bin dir to PATH
Wed, 05 Jul 2023 15:07:51 +0100 Henry S. Thompson tweak UI: copy/paste and title bar
Wed, 05 Jul 2023 15:02:53 +0100 Henry S. Thompson ec184 now, run w. unbuffered output
Wed, 05 Jul 2023 14:52:00 +0100 Henry S. Thompson moved to work tree
Wed, 05 Jul 2023 14:50:00 +0100 Henry S. Thompson working, about to move to work tree
Mon, 03 Jul 2023 18:16:14 +0100 Henry S. Thompson working on implementing types and parts:
Tue, 10 Jan 2023 17:48:26 +0000 Henry S. Thompson change account back
Thu, 28 Jul 2022 17:25:09 +0100 Henry S. Thompson x
Thu, 28 Jul 2022 17:24:29 +0100 Henry S. Thompson generalised sbatch front-end to cdx2tsv.py
Thu, 28 Jul 2022 15:33:21 +0100 Henry S. Thompson x
Wed, 20 Jul 2022 19:48:11 +0100 Henry S. Thompson add $W
Wed, 20 Jul 2022 19:47:21 +0100 Henry S. Thompson new-style log notice
Wed, 20 Jul 2022 19:46:51 +0100 Henry S. Thompson x
Mon, 18 Jul 2022 19:16:20 +0100 Henry S. Thompson new style batch jobs, see cirrus_work repo for _xxx.sh
Mon, 18 Jul 2022 19:15:20 +0100 Henry S. Thompson old style
Mon, 18 Jul 2022 18:40:12 +0100 Henry S. Thompson symlink to dir doesn't work
Mon, 18 Jul 2022 18:30:56 +0100 Henry S. Thompson work-path bin dir
Mon, 18 Jul 2022 18:16:27 +0100 Henry S. Thompson previous approach to lang/field extraction
Mon, 18 Jul 2022 18:11:46 +0100 Henry S. Thompson moved to shared/bin
Mon, 18 Jul 2022 17:59:43 +0100 Henry S. Thompson x
Mon, 18 Jul 2022 17:39:35 +0100 Henry S. Thompson x
Wed, 06 Jul 2022 18:07:34 +0100 Henry S. Thompson demo of slurm usage using cdx2tsv.py
Wed, 06 Jul 2022 18:00:53 +0100 Henry S. Thompson do whole line
Mon, 04 Jul 2022 18:14:41 +0100 Henry S. Thompson no more gentoo,
Mon, 04 Jul 2022 18:12:26 +0100 Henry S. Thompson allow use of global stash
Fri, 01 Jul 2022 17:50:06 +0200 Henry Thompson for 2022 exercise
Wed, 17 Nov 2021 18:26:33 +0000 Henry S. Thompson instead of csv
Mon, 01 Nov 2021 21:23:13 +0000 Henry S. Thompson add -c switch to btot
Thu, 28 Oct 2021 12:11:08 +0000 Henry S. Thompson use sqlite3 just to tabulate
Tue, 26 Oct 2021 14:07:34 +0000 Henry S. Thompson fixed
Tue, 26 Oct 2021 14:05:35 +0000 Henry S. Thompson working, with compound driver files
Mon, 25 Oct 2021 15:07:03 +0000 Henry S. Thompson better comments
Mon, 25 Oct 2021 15:05:46 +0000 Henry S. Thompson do the work for cdx2sql
Mon, 25 Oct 2021 15:05:25 +0000 Henry S. Thompson change test to use Master
Fri, 22 Oct 2021 12:36:15 +0000 Henry S. Thompson works for 0--9
Thu, 21 Oct 2021 19:18:47 +0000 Henry S. Thompson replace too-complex invocation of cdx2tsv
Wed, 20 Oct 2021 17:14:18 +0000 Henry S. Thompson basic, works
Wed, 20 Oct 2021 15:47:55 +0000 Henry S. Thompson too clever by half, keys won't work in parallel for e.g. media types
Tue, 19 Oct 2021 12:57:50 +0000 Henry S. Thompson working, w. pickle
Tue, 19 Oct 2021 12:56:14 +0000 Henry S. Thompson mail-lib
Tue, 19 Oct 2021 12:55:30 +0000 Henry S. Thompson move to ec164.guest
Fri, 23 Jul 2021 22:19:15 +0000 Henry S. Thompson fixed bug(s) wrt large payload files
Fri, 23 Jul 2021 16:23:46 +0000 Henry S. Thompson just barely working
Wed, 21 Jul 2021 20:05:42 +0000 Henry S. Thompson add cl arg --fpath replacing FPAT, which is now default value
Wed, 21 Jul 2021 20:04:11 +0000 Henry S. Thompson more paths
Wed, 14 Jul 2021 16:50:30 +0000 Henry S. Thompson add usage/help info
Wed, 14 Jul 2021 16:49:54 +0000 Henry S. Thompson add usage/help info
Wed, 14 Jul 2021 16:49:35 +0000 Henry S. Thompson parameterise the temp file and move it to /dev/shm
Wed, 14 Jul 2021 15:30:29 +0000 Henry S. Thompson sic
Fri, 09 Jul 2021 14:20:51 +0000 Henry S. Thompson use printf safely
Fri, 09 Jul 2021 13:46:10 +0000 Henry S. Thompson handle multiple L-M lines :-(
Fri, 09 Jul 2021 13:45:43 +0000 Henry S. Thompson improve error handling
Fri, 09 Jul 2021 13:45:04 +0000 Henry S. Thompson more focussed, better SLURM_... vars
Tue, 29 Jun 2021 08:00:40 +0000 Henry S. Thompson bits and pieces
Tue, 29 Jun 2021 07:53:47 +0000 Henry S. Thompson better btot
Mon, 28 Jun 2021 21:50:30 +0000 Henry S. Thompson extract Last Modified via cdx
Mon, 28 Jun 2021 17:16:34 +0000 Henry S. Thompson fix path to qpdf
Mon, 28 Jun 2021 17:16:15 +0000 Henry S. Thompson silently skip robotstxt
Mon, 28 Jun 2021 17:15:19 +0000 Henry S. Thompson workaround histcontrol
Mon, 28 Jun 2021 15:40:10 +0000 Henry S. Thompson support field edit
Mon, 28 Jun 2021 14:01:41 +0000 Henry S. Thompson for use in processing CC index files
Wed, 16 Jun 2021 16:12:46 +0000 Henry S. Thompson implement --cmd
Wed, 16 Jun 2021 16:12:16 +0000 Henry S. Thompson qpdf needs LD_LIB_PATH
Tue, 15 Jun 2021 18:04:34 +0000 Henry S. Thompson refactor final processing loop,
Tue, 15 Jun 2021 16:58:31 +0000 Henry S. Thompson frame size
Tue, 15 Jun 2021 16:58:03 +0000 Henry S. Thompson include sh-script
Mon, 26 Apr 2021 17:18:29 +0000 Henry S. Thompson all parts working, idempotency achieved
Mon, 26 Apr 2021 17:17:58 +0000 Henry S. Thompson debugging
Mon, 26 Apr 2021 17:17:38 +0000 Henry S. Thompson (none)
Mon, 26 Apr 2021 15:28:23 +0000 Henry S. Thompson warc and headers parts working
Thu, 22 Apr 2021 21:31:03 +0000 Henry S. Thompson back to IGzipFile
Thu, 22 Apr 2021 21:10:02 +0000 Henry S. Thompson approved Popen version using .communicate
Thu, 22 Apr 2021 19:06:55 +0000 Henry S. Thompson using Popen to run igzip (also not great)
Tue, 20 Apr 2021 19:11:57 +0000 Henry S. Thompson added support for copying to/using /dev/shm or /tmp
Tue, 20 Apr 2021 12:26:09 +0000 Henry S. Thompson working with -x and rich directory structure
Tue, 20 Apr 2021 11:12:35 +0000 Henry S. Thompson convert to rich directory structure per 2019-35
Mon, 19 Apr 2021 18:09:51 +0000 Henry S. Thompson -x barely working
Mon, 19 Apr 2021 18:09:25 +0000 Henry S. Thompson never should have added
Mon, 19 Apr 2021 13:08:16 +0000 Henry S. Thompson better dd error handling
Mon, 19 Apr 2021 13:07:58 +0000 Henry S. Thompson (none)
Sun, 18 Apr 2021 17:03:45 +0000 Henry S. Thompson bare minimum working
Fri, 16 Apr 2021 18:28:00 +0000 Henry S. Thompson triple args checked, filename opened
Fri, 16 Apr 2021 13:15:23 +0000 Henry S. Thompson help format hacking done
Fri, 16 Apr 2021 12:55:05 +0000 Henry S. Thompson basic help format hacking works
Fri, 16 Apr 2021 09:01:16 +0000 Henry S. Thompson (none)
Fri, 16 Apr 2021 09:00:17 +0000 Henry S. Thompson (none)
Thu, 15 Apr 2021 19:22:27 +0000 Henry S. Thompson just struggling with argparse
Thu, 15 Apr 2021 10:59:25 +0000 Henry S. Thompson support a command to receive each result,
Wed, 14 Apr 2021 20:15:32 +0000 Henry S. Thompson accepts index lines, less line-at-a-time
Wed, 14 Apr 2021 10:08:41 +0000 Henry S. Thompson working with one input
Wed, 14 Apr 2021 08:57:43 +0000 Henry S. Thompson -w and -h working
Tue, 13 Apr 2021 17:52:31 +0000 Henry S. Thompson working on flags
Tue, 13 Apr 2021 17:02:09 +0000 Henry S. Thompson new
Tue, 16 Mar 2021 16:20:02 +0000 Henry S. Thompson working with locking and copying
Mon, 15 Mar 2021 14:26:42 +0000 Henry S. Thompson working for -t 2 -c 2
Mon, 15 Mar 2021 14:20:00 +0000 Henry S. Thompson minor
Sun, 14 Mar 2021 21:28:02 +0000 Henry S. Thompson prepare for real parallel distribution
Sun, 14 Mar 2021 21:25:01 +0000 Henry S. Thompson environment improvements
Wed, 03 Mar 2021 19:33:56 +0000 Henry S. Thompson trying to move to slurm
Sat, 09 May 2020 16:16:28 +0100 Henry S. Thompson improved F handling/logging
Fri, 08 May 2020 19:52:36 +0100 Henry S. Thompson keep separate antecedents separate, buggy?
Thu, 07 May 2020 18:47:24 +0100 Henry S. Thompson track redirects, need to use full crawldiagnostics.warc.gz for "location:" and "Uri:"
Thu, 07 May 2020 11:33:24 +0100 Henry S. Thompson refactor, change summary print (problem?)
Wed, 06 May 2020 18:28:52 +0100 Henry S. Thompson bare framework working
Wed, 06 May 2020 14:25:44 +0100 Henry S. Thompson starting on tool to assemble as complete as we have info wrt a seed URI