comparison lurid3/notes.txt @ 64:a70ceb9d1e82 default tip

progress with cdb
author Henry S. Thompson <ht@inf.ed.ac.uk>
date Sat, 18 Jan 2025 21:33:00 +0000
parents 663e55844c1d
63:663e55844c1d 64:a70ceb9d1e82
1117 11195226 3.058 0.000 3.058 0.000 {built-in method isal.isal_zlib.crc32}
1118 11195236 2.784 0.000 2.784 0.000 {method 'write' of '_io.BufferedWriter' objects}
1119 11064996/11064988 1.343 0.000 1.343 0.000 {built-in method builtins.len}
1120 11195324 1.309 0.000 1.309 0.000 {built-in method builtins.isinstance}
1121 2 0.040 0.020 0.040 0.020 {built-in method io.open}
1122
1123 After a long diversion to get cython access to C extensions working,
1124 now have a setup that works, in lib/python/cc/lmh:
1125
1126 >: python3 setup.py build_ext -i
1127 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy
1128 testing...
1129 0
1130 tested
1131 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/
1132 testing...
1133 1
1134 tested
1135 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ 10000000
1136 testing...
1137 1
1138 tested
1139 1.8913232665508986
1140 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy 10000000
1141 testing...
1142 0
1143 tested
1144 1.31287781894207
1145 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy 10000000
1146 testing...
1147 0
1148 tested
1149 1.3081446290016174
1150 So, maybe that's hopeful
1151
1152 to get the values out:
1153 https://candide-guevara.github.io/cs_related/2018/02/28/cython-memory-mgt.html
1154 mmap->memoryview, maybe?
1155 or
1156 cdef char[::1] mview = <char[:size:1]>(bp)
1157 https://stackoverflow.com/questions/50228544/how-do-i-read-a-c-char-array-into-a-python-bytearray-with-cython
1158 or, if that doesn't work, have to copy:
1159 https://stackoverflow.com/questions/50228544/how-do-i-read-a-c-char-array-into-a-python-bytearray-with-cython
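
A minimal sketch of the mmap->memoryview idea, in plain Python rather than Cython, assuming the lookup hands back an (offset, length) pair into the cdb file. The throwaway file, the offset 100 and the variable names are made up for illustration; only the 10-byte value is taken from the notes.

```python
import mmap
import os
import tempfile

# Throwaway file standing in for a real .cdb file (hypothetical layout:
# 100 filler bytes, then a 10-byte value, then more filler).
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"x" * 100 + b"1564555978" + b"y" * 20)
    path = f.name

offset, length = 100, 10  # what a lookup would report

with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    # memoryview(mm) exposes the mapping without copying; slicing it is
    # also zero-copy.  Both views are released on leaving the block,
    # which has to happen before the mmap itself can be closed.
    with memoryview(mm) as base, base[offset:offset + length] as view:
        value = bytes(view)  # copy only when an owned bytes object is needed

os.unlink(path)
print(value)  # b'1564555978'
```

The copy in bytes(view) corresponds to the "have to copy" fallback. Code that can work on the view directly while the mapping is open avoids it.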
1160
1161 Not quite there, but, maybe, nearly:
1162 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ 10000000
1163 testing...
1164 2488 10 # offset and length?
1165 tested
1166 1.9035462848842144
1167 >: dd ibs=1 skip=2488 count=10 if=~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb of=/dev/stdout
1168 1564555978
1169 >: cdbget 20190825142846http://71.43.189.10/dermorph/ <~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb
1170 1564555978
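
For cross-checking the "offset and length" reading without the C extension, here is a rough pure-Python sketch of a cdb lookup that returns the same kind of (offset, length) pair. It assumes the standard cdb layout (2048-byte header of 256 little-endian (position, slot count) pairs, records stored as key length, data length, key, data). cdb_hash, cdb_find and cdb_build are names invented here, not the API of the db module above, and cdb_build exists only to give the demo something to search.

```python
import struct

def cdb_hash(key: bytes) -> int:
    """The cdb hash: h = 5381, then h = (h * 33) ^ byte, kept to 32 bits."""
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h

def cdb_find(buf: bytes, key: bytes):
    """Return (offset, length) of the first value stored under key, or None."""
    h = cdb_hash(key)
    tpos, nslots = struct.unpack_from("<II", buf, (h & 255) * 8)
    if nslots == 0:
        return None
    slot = (h >> 8) % nslots
    for _ in range(nslots):
        slot_hash, rpos = struct.unpack_from("<II", buf, tpos + 8 * slot)
        if rpos == 0:          # empty slot ends the probe sequence
            return None
        if slot_hash == h:
            klen, dlen = struct.unpack_from("<II", buf, rpos)
            if buf[rpos + 8:rpos + 8 + klen] == key:
                return rpos + 8 + klen, dlen
        slot = (slot + 1) % nslots
    return None

def cdb_build(items) -> bytes:
    """Build a small cdb image in memory (demo helper only)."""
    records = bytearray()
    tables = [[] for _ in range(256)]
    pos = 2048                 # records start right after the header
    for k, v in items:
        h = cdb_hash(k)
        tables[h & 255].append((h, pos))
        rec = struct.pack("<II", len(k), len(v)) + k + v
        records += rec
        pos += len(rec)
    header = bytearray()
    hashed = bytearray()
    for entries in tables:
        nslots = 2 * len(entries)
        header += struct.pack("<II", pos, nslots)
        slots = [(0, 0)] * nslots
        for h, rpos in entries:
            i = (h >> 8) % nslots
            while slots[i][1] != 0:   # linear probing on collision
                i = (i + 1) % nslots
            slots[i] = (h, rpos)
        for s in slots:
            hashed += struct.pack("<II", *s)
        pos += 8 * nslots
    return bytes(header + records + hashed)

key = b"20190825142846http://71.43.189.10/dermorph/"
image = cdb_build([(b"other", b"zzz"), (key, b"1564555978")])
hit = cdb_find(image, key)
offset, length = hit
print(hit, image[offset:offset + length])
print(cdb_find(image, b"xyzzy"))  # None
```

On a real file the same cdb_find could be pointed at an mmap of the .cdb, and the pair it reports compared with what the extension prints.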
1171 ================
1172
1173
1174 Try it with the existing _per segment_ index we have for 2019-35
1175