Mercurial > hg > cc > cirrus_work
graph
-
for pub (15 months ago, by Henry S. Thompson)
-
tweaked formatting (15 months ago, by Henry S. Thompson)
-
excel rewrote, no important changes (?) (15 months ago, by Henry S. Thompson)
-
replace wrong one with right one (15 months ago, by Henry S. Thompson)
-
merge (15 months ago, by Henry S. Thompson)
-
implement alternative confidence measure using stats.bootstrap, (15 months ago, by Henry S. Thompson)
-
for LMh percentile (15 months ago, by Henry S. Thompson)
-
decorated (15 months ago, by Henry S. Thompson)
-
merge (15 months ago, by Henry S. Thompson)
-
can't add props to DescribeResult (15 months ago, by Henry S. Thompson)
-
for 2023-40 (15 months ago, by Henry S. Thompson)
-
with decorations (15 months ago, by Henry S. Thompson)
-
excel rewrote, no important changes (?) (15 months ago, by Henry S. Thompson)
-
with percentile instead of raw mean correl (15 months ago, by Henry S. Thompson)
-
change heatmap to by percentile (15 months ago, by Henry S. Thompson)
-
with heat (15 months ago, by Henry S. Thompson)
-
heat map for mime vs. nl1 vs. len (15 months ago, by Henry S. Thompson)
-
add head_map fn (15 months ago, by Henry S. Thompson)
-
add explore_deltas and predict analysis fns (15 months ago, by Henry S. Thompson)
-
rename to avoid name clash with scipy.stats (15 months ago, by Henry S. Thompson)
-
move to class with local vars instead of many globals (15 months ago, by Henry S. Thompson)
-
renamed to by_interval.py (15 months ago, by Henry S. Thompson)
-
renamed from spearman.py (15 months ago, by Henry S. Thompson)
-
renamed to stats.py (15 months ago, by Henry S. Thompson)
-
do the __main__ thing (15 months ago, by Henry S. Thompson)
-
put results in numbered subdirs (15 months ago, by Henry S. Thompson)
-
add minimal logging and don't return until finished (15 months ago, by Henry S. Thompson)
-
should work for months also now (15 months ago, by Henry S. Thompson)
-
cross-language confusion :-) (15 months ago, by Henry S. Thompson)
-
LM plot for multiple crawls, magnitude or %age (15 months ago, by Henry S. Thompson)
-
can overlay the two (16 months ago, by Henry S. Thompson)
-
fix output year (16 months ago, by Henry S. Thompson)
-
sic (16 months ago, by Henry S. Thompson)
-
sic (16 months ago, by Henry S. Thompson)
-
get in/out file management working right (16 months ago, by Henry S. Thompson)
-
refactor to provide for buffer overflow fix (16 months ago, by Henry S. Thompson)
-
bug-fix wrt 1st time, (16 months ago, by Henry S. Thompson)
-
make extra file info optional (16 months ago, by Henry S. Thompson)
-
forget parallel, just do (default 2) parallel single threads (16 months ago, by Henry S. Thompson)
-
add missing makedir (16 months ago, by Henry S. Thompson)
-
now does one named segment only (16 months ago, by Henry S. Thompson)
-
resurrect parallel fetch (16 months ago, by Henry S. Thompson)
-
convert to single thread, (16 months ago, by Henry S. Thompson)
-
avoid global name conflict (16 months ago, by Henry S. Thompson)
-
moved from /beegfs/common-crawl to get under .hg (16 months ago, by Henry S. Thompson)
-
fix typo (16 months ago, by Henry S. Thompson)
-
build cluster.idx (17 months ago, by Henry S. Thompson)
-
no longer using cmp_to_key (17 months ago, by Henry S. Thompson)
-
new branch to save do_idx.sh from abandoned merge fixup mergefix (17 months ago, by Henry S. Thompson)
-
try to get the counts right, particularly when re-merging (17 months ago, by Henry S. Thompson)
-
for use in debugging, see notes and tests 2, 17, merge test (17 months ago, by Henry S. Thompson)
-
add various www deletion cases (17 months ago, by Henry S. Thompson)
-
iterate WPAT fix with improved pattern (17 months ago, by Henry S. Thompson)
-
loosen WARC pattern to avoid failure from "mime" = "{...}" intervening (17 months ago, by Henry S. Thompson)
-
refactor to enable rerun with fixup, (17 months ago, by Henry S. Thompson)
-
correct mistaken futnsz test, (17 months ago, by Henry S. Thompson)
-
change path to merge_date.py (17 months ago, by Henry S. Thompson)
-
remove the mistaken deletion of NONPRINT, (17 months ago, by Henry S. Thompson)
-
fix a bad fix and a bad test for the televida case (17 months ago, by Henry S. Thompson)
-
fix and test for all-decimal host (17 months ago, by Henry S. Thompson)
-
no import in lmh.__init__ any more (17 months ago, by Henry S. Thompson)
-
importing in __init__ causes problems (17 months ago, by Henry S. Thompson)
-
commented out duplicate, handle comments better (17 months ago, by Henry S. Thompson)
-
more corner case tests (17 months ago, by Henry S. Thompson)
-
tweaks to get all tests through #14 (17 months ago, by Henry S. Thompson)
-
get 7f (two cases) and %25 working (17 months ago, by Henry S. Thompson)
-
add televida case test (17 months ago, by Henry S. Thompson)
-
add test description (17 months ago, by Henry S. Thompson)
-
importable just in case (17 months ago, by Henry S. Thompson)
-
move most of the hacking into fixGoogleCanon, (17 months ago, by Henry S. Thompson)
-
forget assert, allow multiple failures (17 months ago, by Henry S. Thompson)
-
x (17 months ago, by Henry S. Thompson)
-
found right place for \x7f hack, maybe (17 months ago, by Henry S. Thompson)
-
readability (17 months ago, by Henry S. Thompson)
-
x (17 months ago, by Henry S. Thompson)
-
refactor to sort a module in an lmh package (17 months ago, by Henry S. Thompson)
-
start some regression tests (17 months ago, by Henry S. Thompson)
-
creating lmh package (17 months ago, by Henry S. Thompson)
-
moved from bin (17 months ago, by Henry S. Thompson)
-
minor bug wrt EOF of final cdx input file (17 months ago, by Henry S. Thompson)
-
replicate two extremely-corner cases of the way (17 months ago, by Henry S. Thompson)
-
a bit more logging (17 months ago, by Henry S. Thompson)
-
a bit more logging (17 months ago, by Henry S. Thompson)
-
robotstxt and crawldiagnostics get free ride, (17 months ago, by Henry S. Thompson)
-
a few more from ecclerig, (17 months ago, by Henry S. Thompson)
-
refactor datestream reading, (17 months ago, by Henry S. Thompson)
-
more faithful regexps and non-byte uri output (17 months ago, by Henry S. Thompson)
-
one uncommited fix from quentin (17 months ago, by Henry S. Thompson)
-
pass in debug flag(s) to merge_date.py (17 months ago, by Henry Thompson)
-
loosen must-match criterion in the both-messy case (17 months ago, by Henry Thompson)
-
one more sid fix, (17 months ago, by Henry Thompson)
-
working on sessionID pblms, still (17 months ago, by Henry S. Thompson)
-
first try (17 months ago, by Henry Thompson)
-
switch to gzip -7 to get comparable compressed cdx block size (17 months ago, by Henry S. Thompson)
-
use my own Canonicalizer to fix more obscure (17 months ago, by Henry S. Thompson)
-
re-instate logging splits for .idx (17 months ago, by Henry S. Thompson)
-
reinstate better check to start queuing, (17 months ago, by Henry S. Thompson)
-
bug4 fixed, but that created a new, earlier bug (17 months ago, by Henry S. Thompson)
-
rework handling of session key problem (17 months ago, by Henry S. Thompson)
-
initialise paths for csing (17 months ago, by Henry S. Thompson)
-
d'oh (17 months ago, by Henry S. Thompson)
-
include full URI in output (17 months ago, by Henry S. Thompson)
-
try to do csing correctly on compute nodes (17 months ago, by Henry S. Thompson)
-
version which outputs more identification, (17 months ago, by Henry S. Thompson)
-
last version before giving up on approach based only on key and datestamp (17 months ago, by Henry S. Thompson)
-
improve reordering, still failing on cdx-00004 (18 months ago, by Henry S. Thompson)
-
attempt at reordering if necessary (18 months ago, by Henry S. Thompson)
-
mostly working, but need to reorder in case of cfid and friends (18 months ago, by Henry S. Thompson)
-
flip loops (18 months ago, by Henry S. Thompson)
-
merge a stream of ks files with a set of cdx files (18 months ago, by Henry S. Thompson)
-
final keystroke fixes, recurse and decimal www stripping (18 months ago, by Henry S. Thompson)
-
final keystroke fixes, (18 months ago, by Henry S. Thompson)
-
handle double .www, more keep-me chars (18 months ago, by Henry S. Thompson)
-
work-around for weird handling of %-encoding in Java impl. of SURT (18 months ago, by Henry S. Thompson)
-
merge, including pointless fix wrt pq (18 months ago, by Henry Thompson)
-
use surt instead of trying to create index term by hand (18 months ago, by Henry Thompson)
-
merge (18 months ago, by Henry Thompson)
-
stale (18 months ago, by Henry Thompson)
-
catching up by hand with markup version, (18 months ago, by Henry Thompson)
-
include timestamp (18 months ago, by Henry S. Thompson)
-
include query (18 months ago, by Henry S. Thompson)
-
make CC's own sorting explicit (18 months ago, by Henry S. Thompson)
-
handle corner cases with final . and initial www..+ (18 months ago, by Henry S. Thompson)
-
handle %-encoded utf-8 as idna (18 months ago, by Henry S. Thompson)
-
merge (18 months ago, by Henry S. Thompson)
-
compute timestamps, key and sort lmh lines (18 months ago, by Henry S. Thompson)
-
work with csing (18 months ago, by Henry S. Thompson)
-
get man -k working (18 months ago, by Henry S. Thompson)
-
for warc_lmh slurm logs (19 months ago, by Henry Thompson)
-
for timing analysis (19 months ago, by Henry S. Thompson)
-
add support for multiple calls to srun with a counter (19 months ago, by Henry S. Thompson)
-
fix eof bug, expand error messages (19 months ago, by Henry S. Thompson)
-
part 2 is now working for all types (19 months ago, by Henry S. Thompson)
-
add a response-only test (19 months ago, by Henry S. Thompson)
-
revert to just showing first LM (19 months ago, by Henry S. Thompson)
-
more tests (19 months ago, by Henry S. Thompson)
-
Test 2 works with parts=1,2,3. (19 months ago, by Henry S. Thompson)
-
whole working (19 months ago, by Henry S. Thompson)
-
tests 1 & 2 now working (19 months ago, by Henry S. Thompson)
-
avoid slicing buf by using memoryview to save copying (19 months ago, by Henry S. Thompson)
-
but skip at eobp is not working (with test 2) (19 months ago, by Henry S. Thompson)
-
works with all types, part=1 (19 months ago, by Henry S. Thompson)
-
rework completely to refill as much as possible only when necessary, (19 months ago, by Henry S. Thompson)
-
finds multiples (19 months ago, by Henry S. Thompson)
-
little steps (20 months ago, by Henry S. Thompson)
-
made 1 mean 1, still losing after a while (20 months ago, by Henry S. Thompson)
-
better debugging output (20 months ago, by Henry S. Thompson)
-
working better, gets confused by 3-part response (20 months ago, by Henry S. Thompson)
-
a bit better (20 months ago, by Henry S. Thompson)
-
just barely working for 1, need to rethink buffering (20 months ago, by Henry S. Thompson)
-
starting on conversion to direct-querying of buffer (20 months ago, by Henry S. Thompson)
-
sic (20 months ago, by Henry S. Thompson)
-
support on-board unzipping, reduce buffer size to 2MB (20 months ago, by Henry S. Thompson)
-
make test 1 idempotent (20 months ago, by Henry S. Thompson)
-
just count part length (20 months ago, by Henry S. Thompson)
-
get EOF right, finally (20 months ago, by Henry S. Thompson)
-
make warc.py a library, separate out testing (20 months ago, by Henry S. Thompson)
-
correct comment (20 months ago, by Henry S. Thompson)
-
add lots more debugging output, (20 months ago, by Henry S. Thompson)
-
moved from home bin (20 months ago, by Henry S. Thompson)
-
doc pointer (2023-01-10, by Henry S. Thompson)
-
push actions in main fn (2022-12-13, by Henry S. Thompson)
-
fixed for paper (2022-12-13, by Henry S. Thompson)
-
fix N (2022-11-24, by Henry S. Thompson)
-
compute and graph confidence intervals (2022-11-23, by Henry S. Thompson)
-
generalise hist (2022-11-22, by Henry S. Thompson)
-
add sort flag to plot_x (2022-11-22, by Henry S. Thompson)
-
get multi-ranking done right (2022-11-17, by Henry S. Thompson)
-
comments and more care about rows vs. columns (2022-11-17, by Henry S. Thompson)
-
start work on ranking, (2022-11-16, by Henry S. Thompson)
-
Spearman for matlab (2022-11-16, by Henry S. Thompson)
-
move all plots into functions (2022-11-16, by Henry S. Thompson)
-
a bit more (2022-11-15, by Henry S. Thompson)
-
framework for stats over results of rank correlations (2022-11-14, by Henry S. Thompson)
-
first plot efforts w. scipy (2022-11-11, by Henry S. Thompson)
-
sic (2022-10-21, by Henry S. Thompson)
-
accept filenames on stdin, (2022-09-29, by Henry S. Thompson)
-
interpolate process0, support permanent subproc (2022-09-29, by Henry S. Thompson)
-
new (2022-09-29, by Henry S. Thompson)
-
new (2022-09-29, by Henry S. Thompson)
-
write to tmp file implemented (2022-08-07, by Henry S. Thompson)
-
use awk for simple cut (2022-08-07, by Henry S. Thompson)
-
toward link extractions from pdf (2022-08-07, by Henry S. Thompson)
-
in progress... (2022-08-07, by Henry S. Thompson)
-
x (2022-08-07, by Henry S. Thompson)
-
x (2022-07-28, by Henry S. Thompson)
-
fix quoting pblm by using parallel ... -q (2022-07-28, by Henry S. Thompson)
-
catch-up (2022-07-28, by Henry S. Thompson)
-
minimal hst preferred options (2022-07-23, by Henry S. Thompson)
-
work around problem with PROMPT_COMMAND (2022-07-23, by Henry S. Thompson)
-
x (2022-07-20, by Henry S. Thompson)
-
fix PROMPT_COMMAND (2022-07-20, by Henry S. Thompson)
-
x (2022-07-20, by Henry S. Thompson)
-
tidy up and include uniq -c (2022-07-20, by Henry S. Thompson)
-
convert to no longer need uniq -c (2022-07-20, by Henry S. Thompson)
-
oops, 1.1 was half-modified, bogus (2022-07-19, by Henry S. Thompson)
-
compute node workers, see cirrus_home/bin repo for login node masters (2022-07-18, by Henry S. Thompson)
-
getting started (2022-07-18, by Henry S. Thompson)
-
getting started (2022-07-18, by Henry S. Thompson)