changeset 40:4167d8f33325

start lab notes for LURID3
author Henry S. Thompson <ht@inf.ed.ac.uk>
date Tue, 20 Aug 2024 15:27:47 +0100
parents ddedac65afa2
children 64b7fb44e8dc
files lurid3/notes.txt lurid3/status.xlsx
diffstat 2 files changed, 10 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/lurid3/notes.txt	Tue Aug 20 15:27:47 2024 +0100
@@ -0,0 +1,10 @@
+See old_notes.txt for all older notes on Common Crawl data processing,
+starting from Azure via Turing and then LURID and LURID2.
+
+Installed /beegfs/common_crawl/CC-MAIN-2024-33/cdx
+  >: cd results/CC-MAIN-2024-33/cdx/
+  >: cut -f 2 counts.tsv | btot
+  2,793,986,828 
+
+State of play wrt data -- see status.xlsx
+
Binary file lurid3/status.xlsx has changed
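
The total in notes.txt comes from piping the second column of counts.tsv into btot, which appears to be a local helper that is not defined in this changeset. A minimal sketch of an equivalent, assuming btot simply sums one number per line of standard input (sum_column is a hypothetical name, and the comma grouping seen in the output above is not reproduced):

```shell
# Hypothetical stand-in for the local btot helper.
# Assumption: btot reads one number per line on stdin and prints their sum.
sum_column() {
  # %.0f rather than %d, so totals above 2^31 are not clamped by older awks
  awk '{ total += $1 } END { printf "%.0f\n", total }'
}

# Small self-contained demo on tab-separated sample data (not real counts)
printf 'cdx-00000\t10\ncdx-00001\t32\n' | cut -f 2 | sum_column
```

Against the real file this would be run as: cut -f 2 counts.tsv | sum_column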