changeset 64:a70ceb9d1e82 default tip
progress with cdb
author | Henry S. Thompson <ht@inf.ed.ac.uk> |
---|---|
date | Sat, 18 Jan 2025 21:33:00 +0000 |
parents | 663e55844c1d |
children | |
files | lurid3/notes.txt |
diffstat | 1 files changed, 49 insertions(+), 0 deletions(-) |
--- a/lurid3/notes.txt Mon Jan 06 17:59:20 2025 +0000
+++ b/lurid3/notes.txt Sat Jan 18 21:33:00 2025 +0000
@@ -1119,6 +1119,55 @@
 11064996/11064988 1.343 0.000 1.343 0.000 {built-in method builtins.len}
 11195324 1.309 0.000 1.309 0.000 {built-in method builtins.isinstance}
 2 0.040 0.020 0.040 0.020 {built-in method io.open}
+
+After a long diversion to get cython access to C extensions working,
+now have a setup that works, in lib/python/cc/lmh:
+
+  >: python3 setup.py build_ext -i
+  >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy
+  testing...
+  0
+  tested
+  >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/
+  testing...
+  1
+  tested
+  >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ 10000000
+  testing...
+  1
+  tested
+  1.8913232665508986
+  >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy 10000000
+  testing...
+  0
+  tested
+  1.31287781894207
+  sing<3792>: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy 10000000
+  testing...
+  0
+  tested
+  1.3081446290016174
+So, maybe that's hopeful
+
+to get the values out:
+  https://candide-guevara.github.io/cs_related/2018/02/28/cython-memory-mgt.html
+  mmap->memoryview, maybe?
+  or
+  cdef char[::1] mview = <char[:size:1]>(bp)
+  https://stackoverflow.com/questions/50228544/how-do-i-read-a-c-char-array-into-a-python-bytearray-with-cython
+  or, if that doesn't work, have to copy:
+  https://stackoverflow.com/questions/50228544/how-do-i-read-a-c-char-array-into-a-python-bytearray-with-cython
+
+Not quite there, but, maybe, nearly:
+  >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ 10000000
+  testing...
+  2488 10 # offset and length?
+  tested
+  1.9035462848842144
+  >: dd ibs=1 skip=2488 count=10 if=~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb of=/dev/stdout
+  1564555978
+  >: cdbget 20190825142846http://71.43.189.10/dermorph/ <~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb
+  1564555978
 ================
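The dd check at the end of the notes (skip=2488, count=10) can also be done from Python by mapping the file and slicing it. What follows is only a minimal sketch, assuming the pair printed by the test really is a byte offset and a byte length, which the notes themselves only conjecture. read_span is a hypothetical helper, not part of the db module, and the demo runs on a synthetic temporary file rather than on ks_0.cdb.

```python
import mmap
import os
import tempfile


def read_span(path, offset, length):
    """Return `length` bytes starting at byte `offset` of the file at `path`.

    Hypothetical helper: mirrors `dd ibs=1 skip=offset count=length`.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing an mmap copies the requested bytes out of the mapping,
            # so the result stays valid after the mapping is closed.
            return mm[offset:offset + length]


if __name__ == "__main__":
    # Synthetic stand-in for a cdb file: padding, then a 10-byte value.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"x" * 2488 + b"1564555978" + b"tail")
        demo_path = tmp.name
    try:
        print(read_span(demo_path, 2488, 10))
    finally:
        os.remove(demo_path)
```

The slice copies the value out, which corresponds to the "have to copy" fallback in the notes; avoiding the copy would mean keeping the mapping open and handing out a memoryview over it instead.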