Sat, 18 Jan 2025 21:33:00 +0000 |
Henry S. Thompson |
progress with cdb (branch: default, tag: tip)
|
Mon, 06 Jan 2025 17:59:20 +0000 |
Henry S. Thompson |
comparing profiles w and w/o cython
|
Fri, 03 Jan 2025 13:35:14 +0000 |
Henry S. Thompson |
tried cdb, slower by 2 OoM
|
Thu, 02 Jan 2025 18:55:11 +0000 |
Henry S. Thompson |
tried pre-filtering with bloom, not much benefit if any
|
Thu, 02 Jan 2025 15:01:48 +0000 |
Henry S. Thompson |
using python dict test
|
Wed, 01 Jan 2025 23:03:07 +0000 |
Henry S. Thompson |
python dict testing
|
Wed, 01 Jan 2025 15:11:09 +0000 |
Henry S. Thompson |
pybloomfilter testing
|
Tue, 17 Dec 2024 21:25:28 +0000 |
Henry S. Thompson |
minor updates
|
Tue, 22 Oct 2024 14:00:33 +0100 |
Henry S. Thompson |
xxx
|
Tue, 15 Oct 2024 16:06:27 +0100 |
Henry S. Thompson |
merge
|
Tue, 15 Oct 2024 16:01:32 +0100 |
Henry S. Thompson |
for ILCC seminar paper
|
Fri, 11 Oct 2024 16:41:32 +0100 |
Henry S. Thompson |
detailed consistency check with 7 segments from published lmh-augmented cdx
|
Thu, 10 Oct 2024 17:44:58 +0100 |
Henry S. Thompson |
prelim consistency check with published lmh-augmented cdx
|
Wed, 09 Oct 2024 22:55:27 +0100 |
Henry S. Thompson |
done cdx_aux for segments 49--55 of 2019-35
|
Wed, 09 Oct 2024 09:43:07 +0100 |
Henry S. Thompson |
all of 49?
|
Fri, 04 Oct 2024 21:41:53 +0100 |
Henry S. Thompson |
tentative plan for merging
|
Fri, 04 Oct 2024 15:24:00 +0100 |
Henry S. Thompson |
thinking about merging
|
Thu, 03 Oct 2024 18:16:05 +0100 |
Henry S. Thompson |
cdx_extras and unpackz.py working
|
Tue, 01 Oct 2024 16:00:22 +0100 |
Henry S. Thompson |
unpackz.py working
|
Thu, 26 Sep 2024 17:47:58 +0100 |
Henry S. Thompson |
foo
|
Sun, 22 Sep 2024 23:13:56 +0100 |
Henry S. Thompson |
turn attention to nutch-cc and its Cdx code
|
Thu, 05 Sep 2024 17:59:02 +0100 |
Henry S. Thompson |
more downloads
|
Mon, 02 Sep 2024 15:02:01 +0100 |
Henry S. Thompson |
nearly finished downloading for now
|
Wed, 21 Aug 2024 16:11:40 +0100 |
Henry S. Thompson |
extract actual date info for WARC crawls
|
Tue, 20 Aug 2024 15:27:47 +0100 |
Henry S. Thompson |
start lab notes for LURID3
|
Tue, 23 Apr 2024 14:16:04 +0100 |
Henry S. Thompson |
Nearly there
|
Tue, 23 Apr 2024 12:53:47 +0100 |
Henry S. Thompson |
starting Approach
|
Tue, 23 Apr 2024 11:53:35 +0100 |
Henry S. Thompson |
Vision complete
|
Tue, 23 Apr 2024 11:53:06 +0100 |
Henry S. Thompson |
filled in
|
Mon, 22 Apr 2024 18:24:56 +0100 |
Henry S. Thompson |
copied from xml
|