log

date author description
Mon, 05 May 2025 20:57:46 +0100 Henry S. Thompson robotstxt now working? [default]
Mon, 05 May 2025 20:57:30 +0100 Henry S. Thompson add another digit or two (segment #) to key for r_t
Mon, 05 May 2025 20:39:16 +0100 Henry S. Thompson better font
Wed, 23 Apr 2025 11:03:48 +0100 Henry S. Thompson still hacking var bindings...
Tue, 22 Apr 2025 14:32:07 +0100 Henry S. Thompson job index arg to doit, slightly better diagnostic output
Fri, 18 Apr 2025 13:39:55 +0100 Henry S. Thompson extend, then fix, to get it working for crawldiagnostics warc files
Wed, 09 Apr 2025 20:42:29 +0100 Henry S. Thompson fix another long-tail bug
Wed, 09 Apr 2025 17:15:40 +0100 Henry S. Thompson accommodate to change to digits for record type,
Wed, 09 Apr 2025 12:57:50 +0100 Henry S. Thompson simple refill working?
Wed, 09 Apr 2025 11:15:14 +0100 Henry S. Thompson try simpler refill
Tue, 08 Apr 2025 16:06:33 +0100 Henry S. Thompson park that, try fixed large buffer and large-enough min to ensure we always have a whole record in view
Mon, 07 Apr 2025 16:34:31 +0100 Henry S. Thompson in the midst of trying to rethink the refill logic
Mon, 24 Mar 2025 14:30:32 +0000 Henry S. Thompson trying to recover from partial, not-ordered, run of segs 0--7
Sat, 08 Mar 2025 22:31:14 +0000 Henry S. Thompson fix GMT fix,
Fri, 07 Mar 2025 21:17:47 +0000 Henry S. Thompson try to do the whole thing in one go
Fri, 07 Mar 2025 18:15:41 +0000 Henry S. Thompson type decls, cythonize works
Fri, 07 Mar 2025 15:39:36 +0000 Henry S. Thompson type decls, cythonize works
Wed, 05 Mar 2025 23:29:25 +0000 Henry S. Thompson automate a cdb chain
Thu, 27 Feb 2025 18:23:31 +0000 Henry S. Thompson move final report to stderr
Thu, 27 Feb 2025 18:23:05 +0000 Henry S. Thompson work with cdb logging, not sure why it was necessary
Wed, 26 Feb 2025 19:52:22 +0000 Henry S. Thompson parameterise the range of cdbs and segments,
Wed, 19 Feb 2025 17:49:31 +0000 Henry S. Thompson push value printing into C,
Wed, 19 Feb 2025 17:48:11 +0000 Henry S. Thompson try piping instead of python.isal,
Wed, 19 Feb 2025 17:46:24 +0000 Henry S. Thompson trivial test, suitable for gdb
Wed, 12 Feb 2025 20:17:39 +0000 Henry S. Thompson working, but very slowly
Wed, 12 Feb 2025 13:01:05 +0000 Henry S. Thompson maybe ready
Wed, 12 Feb 2025 12:59:28 +0000 Henry S. Thompson renamed
Wed, 12 Feb 2025 11:29:41 +0000 Henry S. Thompson towards a real test of cdb
Tue, 11 Feb 2025 11:25:44 +0000 Henry S. Thompson convert most CCdb methods to cpdef
Tue, 04 Feb 2025 11:17:13 +0000 Henry S. Thompson don't use print. Working
Tue, 04 Feb 2025 11:16:12 +0000 Henry S. Thompson align with change to non-static Cdb.
Tue, 04 Feb 2025 11:13:59 +0000 Henry S. Thompson align with non-static Cdb, add raw access for debugging
Mon, 03 Feb 2025 23:12:55 +0000 Henry S. Thompson running but not working
Mon, 03 Feb 2025 19:16:20 +0000 Henry S. Thompson Test for having multiple cdbs open at once: compiles
Fri, 31 Jan 2025 13:31:02 +0000 Henry S. Thompson use cdb library directly,
Fri, 31 Jan 2025 13:28:09 +0000 Henry S. Thompson use cdb library directly
Mon, 27 Jan 2025 21:19:18 +0000 Henry S. Thompson works with big (ks_0-9.60.cdb) cdb file
Fri, 24 Jan 2025 15:07:00 +0000 Henry S. Thompson finally get test code separated from db.pyx to work
Fri, 24 Jan 2025 15:04:41 +0000 Henry S. Thompson cython header file for db.pyx
Fri, 24 Jan 2025 15:02:57 +0000 Henry S. Thompson remove the testing code, leaving just the class
Fri, 24 Jan 2025 15:01:42 +0000 Henry S. Thompson prepare a ks..tsv file for indexing into a cdb
Thu, 23 Jan 2025 12:53:28 +0000 Henry S. Thompson renamed cpython class Cdb to CCdb to avoid name conflict with cdb.Cdb
Thu, 23 Jan 2025 12:27:57 +0000 Henry S. Thompson work with libcdb.a
Sat, 18 Jan 2025 23:00:30 +0000 Henry S. Thompson value from memory view working
Sat, 18 Jan 2025 21:25:17 +0000 Henry S. Thompson try using cdb as C library
Fri, 17 Jan 2025 20:37:10 +0000 Henry S. Thompson add some cython decoration, not much effect
Fri, 17 Jan 2025 20:35:21 +0000 Henry S. Thompson run with login shell
Fri, 17 Jan 2025 20:34:32 +0000 Henry S. Thompson tweak XEmacs font/key bindings
Fri, 17 Jan 2025 19:58:04 +0000 Henry S. Thompson tweak XEmacs font
Thu, 02 Jan 2025 18:35:08 +0000 Henry S. Thompson time the unpickling
Thu, 02 Jan 2025 18:30:03 +0000 Henry S. Thompson with bloom prefilter
Thu, 02 Jan 2025 14:52:14 +0000 Henry S. Thompson try adding lm to existing index from ks_0-9
Thu, 02 Jan 2025 14:51:00 +0000 Henry S. Thompson output bytes, pickle and save dict if -p, trim lm value to int
Wed, 01 Jan 2025 23:02:35 +0000 Henry S. Thompson test big dict for associating lm timestamp with cc timestamp+uri
Thu, 03 Oct 2024 18:17:55 +0100 Henry S. Thompson working together works well to provide what's needed to update a cdx to include lastmod where possible
Wed, 02 Oct 2024 19:54:45 +0100 Henry S. Thompson make into a library, entry point def unpackz(infileName, callback, outfile = None),
Wed, 02 Oct 2024 11:09:58 +0100 Henry S. Thompson cleaned up indentation to 2 spaces throughout
Wed, 02 Oct 2024 09:56:37 +0100 Henry S. Thompson take bufsize from cmdline
Tue, 01 Oct 2024 15:59:26 +0100 Henry S. Thompson EOF problems fixed, seems to work
Sat, 28 Sep 2024 15:19:05 +0100 Henry S. Thompson working, but last count/offset not being written