18 months ago |
Henry S. Thompson |
attempt at reordering if necessary
|
18 months ago |
Henry S. Thompson |
mostly working, but need to reorder in case of cfid and friends
|
18 months ago |
Henry S. Thompson |
flip loops
|
18 months ago |
Henry S. Thompson |
merge a stream of ks files with a set of cdx files
|
18 months ago |
Henry S. Thompson |
final keystroke fixes, recurse and decimal www stripping
|
18 months ago |
Henry S. Thompson |
final keystroke fixes,
|
18 months ago |
Henry S. Thompson |
handle double .www, more keep-me chars
|
18 months ago |
Henry S. Thompson |
work-around for weird handling of %-encoding in Java impl. of SURT
|
18 months ago |
Henry Thompson |
merge, including pointless fix wrt pq
|
18 months ago |
Henry Thompson |
use surt instead of trying to create index term by hand
|
18 months ago |
Henry Thompson |
merge
|
18 months ago |
Henry Thompson |
stale
|
18 months ago |
Henry Thompson |
catching up by hand with markup version,
|
18 months ago |
Henry S. Thompson |
include timestamp
|
18 months ago |
Henry S. Thompson |
include query
|
18 months ago |
Henry S. Thompson |
make CC's own sorting explicit
|
19 months ago |
Henry S. Thompson |
handle corner cases with final . and initial www..+
|
19 months ago |
Henry S. Thompson |
handle %-encoded utf-8 as idna
|
19 months ago |
Henry S. Thompson |
merge
|
19 months ago |
Henry S. Thompson |
compute timestamps, key and sort lmh lines
|
19 months ago |
Henry S. Thompson |
work with csing
|
19 months ago |
Henry S. Thompson |
get man -k working
|
19 months ago |
Henry Thompson |
for warc_lmh slurm logs
|
19 months ago |
Henry S. Thompson |
for timing analysis
|
19 months ago |
Henry S. Thompson |
add support for multiple calls to srun with a counter
|
19 months ago |
Henry S. Thompson |
fix eof bug, expand error messages
|
19 months ago |
Henry S. Thompson |
part 2 is now working for all types
|
19 months ago |
Henry S. Thompson |
add a response-only test
|
19 months ago |
Henry S. Thompson |
revert to just showing first LM
|
19 months ago |
Henry S. Thompson |
more tests
|
19 months ago |
Henry S. Thompson |
Test 2 works with parts=1,2,3.
|
19 months ago |
Henry S. Thompson |
whole working
|
19 months ago |
Henry S. Thompson |
tests 1 & 2 now working
|
19 months ago |
Henry S. Thompson |
avoid slicing buf by using memoryview to save copying
|
19 months ago |
Henry S. Thompson |
but skip at eobp is not working (with test 2)
|
19 months ago |
Henry S. Thompson |
works with all types, part=1
|
20 months ago |
Henry S. Thompson |
rework completely to refill as much as possible only when necessary,
|
20 months ago |
Henry S. Thompson |
finds multiples
|
20 months ago |
Henry S. Thompson |
little steps
|
20 months ago |
Henry S. Thompson |
made 1 mean 1, still losing after a while
|
20 months ago |
Henry S. Thompson |
better debugging output
|
20 months ago |
Henry S. Thompson |
working better, gets confused by 3-part response
|
20 months ago |
Henry S. Thompson |
a bit better
|
20 months ago |
Henry S. Thompson |
just barely working for 1, need to rethink buffering
|
20 months ago |
Henry S. Thompson |
starting on conversion to direct-querying of buffer
|
20 months ago |
Henry S. Thompson |
sic
|
20 months ago |
Henry S. Thompson |
support on-board unzipping, reduce buffer size to 2MB
|
20 months ago |
Henry S. Thompson |
make test 1 idempotent
|
20 months ago |
Henry S. Thompson |
just count part length
|
20 months ago |
Henry S. Thompson |
get EOF right, finally
|
20 months ago |
Henry S. Thompson |
make warc.py a library, separate out testing
|
20 months ago |
Henry S. Thompson |
correct comment
|
20 months ago |
Henry S. Thompson |
add lots more debugging output,
|
20 months ago |
Henry S. Thompson |
moved from home bin
|
2023-01-10 |
Henry S. Thompson |
doc pointer
|
2022-12-13 |
Henry S. Thompson |
push actions in main fn
|
2022-12-13 |
Henry S. Thompson |
fixed for paper
|
2022-11-24 |
Henry S. Thompson |
fix N
|
2022-11-23 |
Henry S. Thompson |
compute and graph confidence intervals
|
2022-11-22 |
Henry S. Thompson |
generalise hist
|
2022-11-22 |
Henry S. Thompson |
add sort flag to plot_x
|
2022-11-17 |
Henry S. Thompson |
get multi-ranking done right
|
2022-11-17 |
Henry S. Thompson |
comments and more care about rows vs. columns
|
2022-11-16 |
Henry S. Thompson |
start work on ranking,
|
2022-11-16 |
Henry S. Thompson |
Spearman for matlab
|
2022-11-16 |
Henry S. Thompson |
move all plots into functions
|
2022-11-15 |
Henry S. Thompson |
a bit more
|
2022-11-14 |
Henry S. Thompson |
framework for stats over results of rank correlations
|
2022-11-11 |
Henry S. Thompson |
first plot efforts with scipy
|
2022-10-21 |
Henry S. Thompson |
sic
|
2022-09-29 |
Henry S. Thompson |
accept filenames on stdin,
|
2022-09-29 |
Henry S. Thompson |
interpolate process0, support permanent subproc
|
2022-09-29 |
Henry S. Thompson |
new
|
2022-09-29 |
Henry S. Thompson |
new
|
2022-08-07 |
Henry S. Thompson |
write to tmp file implemented
|
2022-08-07 |
Henry S. Thompson |
use awk for simple cut
|
2022-08-07 |
Henry S. Thompson |
toward link extractions from pdf
|
2022-08-07 |
Henry S. Thompson |
in progress...
|
2022-08-07 |
Henry S. Thompson |
x
|
2022-07-28 |
Henry S. Thompson |
x
|
2022-07-28 |
Henry S. Thompson |
fix quoting problem by using parallel ... -q
|
2022-07-28 |
Henry S. Thompson |
catch-up
|
2022-07-23 |
Henry S. Thompson |
minimal hst preferred options
|
2022-07-23 |
Henry S. Thompson |
work around problem with PROMPT_COMMAND
|
2022-07-20 |
Henry S. Thompson |
x
|
2022-07-20 |
Henry S. Thompson |
fix PROMPT_COMMAND
|
2022-07-20 |
Henry S. Thompson |
x
|
2022-07-20 |
Henry S. Thompson |
tidy up and include uniq -c
|
2022-07-20 |
Henry S. Thompson |
convert to no longer need uniq -c
|
2022-07-19 |
Henry S. Thompson |
oops, 1.1 was half-modified, bogus
|
2022-07-18 |
Henry S. Thompson |
compute node workers, see cirrus_home/bin repo for login node masters
|
2022-07-18 |
Henry S. Thompson |
getting started
|
2022-07-18 |
Henry S. Thompson |
getting started
|