log

age author description
Fri, 29 Sep 2023 15:13:51 +0100 Henry S. Thompson tweaks to get all tests through #14
Thu, 28 Sep 2023 18:31:23 +0100 Henry S. Thompson get 7f (two cases) and %25 working
Thu, 28 Sep 2023 18:30:48 +0100 Henry S. Thompson add televida case test
Thu, 28 Sep 2023 16:36:15 +0100 Henry S. Thompson add test description
Thu, 28 Sep 2023 16:35:39 +0100 Henry S. Thompson importable just in case
Thu, 28 Sep 2023 16:34:49 +0100 Henry S. Thompson move most of the hacking into fixGoogleCanon,
Thu, 28 Sep 2023 16:10:05 +0100 Henry S. Thompson forget assert, allow multiple failures
Thu, 28 Sep 2023 16:09:38 +0100 Henry S. Thompson x
Thu, 28 Sep 2023 14:08:36 +0100 Henry S. Thompson found right place for \x7f hack, maybe
Thu, 28 Sep 2023 14:06:11 +0100 Henry S. Thompson readability
Thu, 28 Sep 2023 11:00:36 +0100 Henry S. Thompson x
Thu, 28 Sep 2023 11:00:24 +0100 Henry S. Thompson refactor to sort a module in an lmh package
Thu, 28 Sep 2023 10:54:12 +0100 Henry S. Thompson start some regression tests
Thu, 28 Sep 2023 09:01:18 +0100 Henry S. Thompson creating lmh package
Thu, 28 Sep 2023 08:46:01 +0100 Henry S. Thompson moved from bin
Wed, 27 Sep 2023 17:29:51 +0100 Henry S. Thompson minor bug wrt EOF of final cdx input file
Wed, 27 Sep 2023 17:29:09 +0100 Henry S. Thompson replicate two extremely-corner cases of the way
Tue, 26 Sep 2023 18:55:43 +0100 Henry S. Thompson a bit more logging
Tue, 26 Sep 2023 18:55:11 +0100 Henry S. Thompson a bit more logging
Tue, 26 Sep 2023 17:42:57 +0100 Henry S. Thompson robotstxt and crawldiagnostics get free ride,
Tue, 26 Sep 2023 14:18:40 +0100 Henry S. Thompson a few more from ecclerig,
Tue, 26 Sep 2023 09:03:47 +0100 Henry S. Thompson refactor datestream reading,
Mon, 25 Sep 2023 23:53:13 +0100 Henry S. Thompson more faithful regexps and non-byte uri output
Fri, 22 Sep 2023 15:27:28 +0100 Henry S. Thompson one uncommitted fix from quentin
Tue, 19 Sep 2023 19:40:58 +0100 Henry Thompson pass in debug flag(s) to merge_date.py
Tue, 19 Sep 2023 19:29:41 +0100 Henry Thompson loosen must-match criterion in the both-messy case
Tue, 19 Sep 2023 19:28:34 +0100 Henry Thompson one more sid fix,
Sun, 17 Sep 2023 15:18:11 +0100 Henry S. Thompson working on sessionID problems, still
Thu, 14 Sep 2023 19:27:23 +0100 Henry Thompson first try
Wed, 13 Sep 2023 16:48:43 +0100 Henry S. Thompson switch to gzip -7 to get comparable compressed cdx block size
Wed, 13 Sep 2023 12:41:55 +0100 Henry S. Thompson use my own Canonicalizer to fix more obscure
Wed, 13 Sep 2023 12:40:39 +0100 Henry S. Thompson re-instate logging splits for .idx
Tue, 12 Sep 2023 12:14:04 +0100 Henry S. Thompson reinstate better check to start queuing,
Mon, 11 Sep 2023 22:06:45 +0100 Henry S. Thompson bug4 fixed, but that created a new, earlier bug
Mon, 11 Sep 2023 12:56:47 +0100 Henry S. Thompson rework handling of session key problem
Fri, 08 Sep 2023 21:40:52 +0100 Henry S. Thompson initialise paths for csing
Fri, 08 Sep 2023 21:40:06 +0100 Henry S. Thompson d'oh
Fri, 08 Sep 2023 18:06:54 +0100 Henry S. Thompson include full URI in output
Fri, 08 Sep 2023 18:05:57 +0100 Henry S. Thompson try to do csing correctly on compute nodes
Fri, 08 Sep 2023 09:29:25 +0100 Henry S. Thompson version which outputs more identification,
Thu, 07 Sep 2023 18:03:55 +0100 Henry S. Thompson last version before giving up on approach based only on key and datestamp
Wed, 06 Sep 2023 18:51:21 +0100 Henry S. Thompson improve reordering, still failing on cdx-00004
Tue, 05 Sep 2023 17:33:29 +0100 Henry S. Thompson attempt at reordering if necessary
Tue, 05 Sep 2023 17:32:46 +0100 Henry S. Thompson mostly working, but need to reorder in case of cfid and friends
Thu, 31 Aug 2023 14:14:21 +0100 Henry S. Thompson flip loops
Wed, 30 Aug 2023 21:49:43 +0100 Henry S. Thompson merge a stream of ks files with a set of cdx files
Wed, 30 Aug 2023 11:11:31 +0100 Henry S. Thompson final keystroke fixes, recurse and decimal www stripping
Wed, 30 Aug 2023 11:10:54 +0100 Henry S. Thompson final keystroke fixes,
Mon, 28 Aug 2023 21:07:43 +0100 Henry S. Thompson handle double .www, more keep-me chars
Thu, 24 Aug 2023 18:21:41 +0100 Henry S. Thompson work-around for weird handling of %-encoding in Java impl. of SURT
Mon, 21 Aug 2023 13:06:20 -0400 Henry Thompson merge, including pointless fix wrt pq
Sat, 19 Aug 2023 16:33:23 -0400 Henry Thompson use surt instead of trying to create index term by hand
Sat, 19 Aug 2023 16:02:29 -0400 Henry Thompson merge
Sat, 19 Aug 2023 15:58:38 -0400 Henry Thompson stale
Sat, 19 Aug 2023 15:53:59 -0400 Henry Thompson catching up by hand with markup version,
Mon, 21 Aug 2023 13:37:07 +0100 Henry S. Thompson include timestamp
Sun, 20 Aug 2023 00:28:43 +0100 Henry S. Thompson include query
Fri, 18 Aug 2023 18:25:54 +0100 Henry S. Thompson make CC's own sorting explicit
Thu, 10 Aug 2023 22:14:49 +0100 Henry S. Thompson handle corner cases with final . and initial www..+
Wed, 09 Aug 2023 02:01:32 +0100 Henry S. Thompson handle %-encoded utf-8 as idna