Mercurial > hg > cc > cirrus_home
log
date | author | description |
---|---|---|
2021-03-03 | Henry S. Thompson | trying to move to slurm |
2020-05-09 | Henry S. Thompson | improved F handling/logging |
2020-05-08 | Henry S. Thompson | keep separate antecedents separate, buggy? |
2020-05-07 | Henry S. Thompson | track redirects, need to use full crawldiagnostics.warc.gz for "location:" and "Uri:" |
2020-05-07 | Henry S. Thompson | refactor, change summary print (problem?) |
2020-05-06 | Henry S. Thompson | bare framework working |
2020-05-06 | Henry S. Thompson | starting on tool to assemble as complete info as we have wrt a seed URI |
2020-05-06 | Henry S. Thompson | use local .m2/repository for Hadoop 3.4.0 |
2020-05-06 | Henry S. Thompson | works for big files with Hadoop 3.4.0 |
2020-05-06 | Henry S. Thompson | x |
2020-04-28 | Henry S. Thompson | log truncations |
2020-04-28 | Henry S. Thompson | impose some limits |
2020-04-28 | Henry S. Thompson | x |
2020-04-24 | Henry S. Thompson | x |
2020-04-24 | Henry S. Thompson | mostly from Sebastian |
2020-04-24 | Henry S. Thompson | misc |
2020-04-24 | Henry S. Thompson | misc |
2020-04-24 | Henry S. Thompson | fix from Sebastian |
2020-04-24 | Henry S. Thompson | misc |
2020-04-24 | Henry S. Thompson | misc |
2020-04-24 | Henry S. Thompson | several efficiency (hopefully) tweaks |
2020-04-23 | Henry S. Thompson | x |
2020-04-23 | Henry S. Thompson | switch for use on login server, invoke by hand with 0/1 as only cmd line arg |
2020-04-22 | Henry S. Thompson | java stuff |
2020-04-22 | Henry S. Thompson | try nutch fetch for big pdfs |
2020-04-15 | Henry S. Thompson | final most general version |
2020-04-14 | Henry S. Thompson | too big for /dev/shm, split in half |
2020-04-14 | Henry S. Thompson | one-off to convert big extracts.tar into lots of smaller ones |