comparison lurid3/notes.txt @ 64:a70ceb9d1e82 default tip
progress with cdb
author | Henry S. Thompson <ht@inf.ed.ac.uk> |
---|---|
date | Sat, 18 Jan 2025 21:33:00 +0000 |
parents | 663e55844c1d |
children |
comparison
63:663e55844c1d | 64:a70ceb9d1e82 |
---|---|
1117 11195226 3.058 0.000 3.058 0.000 {built-in method isal.isal_zlib.crc32} | 1117 11195226 3.058 0.000 3.058 0.000 {built-in method isal.isal_zlib.crc32} |
1118 11195236 2.784 0.000 2.784 0.000 {method 'write' of '_io.BufferedWriter' objects} | 1118 11195236 2.784 0.000 2.784 0.000 {method 'write' of '_io.BufferedWriter' objects} |
1119 11064996/11064988 1.343 0.000 1.343 0.000 {built-in method builtins.len} | 1119 11064996/11064988 1.343 0.000 1.343 0.000 {built-in method builtins.len} |
1120 11195324 1.309 0.000 1.309 0.000 {built-in method builtins.isinstance} | 1120 11195324 1.309 0.000 1.309 0.000 {built-in method builtins.isinstance} |
1121 2 0.040 0.020 0.040 0.020 {built-in method io.open} | 1121 2 0.040 0.020 0.040 0.020 {built-in method io.open} |
1122 | |
1123 After a long diversion to get cython access to C extensions working, | |
1124 now have a setup that works, in lib/python/cc/lmh: | |
1125 | |
1126 >: python3 setup.py build_ext -i | |
1127 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy | |
1128 testing... | |
1129 0 | |
1130 tested | |
1131 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ | |
1132 testing... | |
1133 1 | |
1134 tested | |
1135 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ 10000000 | |
1136 testing... | |
1137 1 | |
1138 tested | |
1139 1.8913232665508986 | |
1140 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy 10000000 | |
1141 testing... | |
1142 0 | |
1143 tested | |
1144 1.31287781894207 | |
1145 sing<3792>: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb xyzzy 10000000 | |
1146 testing... | |
1147 0 | |
1148 tested | |
1149 1.3081446290016174 | |
1150 So, maybe that's hopeful | |
1151 | |
1152 to get the values out: | |
1153 https://candide-guevara.github.io/cs_related/2018/02/28/cython-memory-mgt.html | |
1154 mmap->memoryview, maybe? | |
1155 or | |
1156 cdef char[::1] mview = <char[:size:1]>(bp) | |
1157 https://stackoverflow.com/questions/50228544/how-do-i-read-a-c-char-array-into-a-python-bytearray-with-cython | |
1158 or, if that doesn't work, have to copy: | |
1159 https://stackoverflow.com/questions/50228544/how-do-i-read-a-c-char-array-into-a-python-bytearray-with-cython | |
1160 | |
1161 Not quite there, but, maybe, nearly: | |
1162 >: python3 -c 'import db' ~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb 20190825142846http://71.43.189.10/dermorph/ 10000000 | |
1163 testing... | |
1164 2488 10 # offset and length? | |
1165 tested | |
1166 1.9035462848842144 | |
1167 >: dd ibs=1 skip=2488 count=10 if=~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb of=/dev/stdout | |
1168 1564555978 | |
1169 >: cdbget 20190825142846http://71.43.189.10/dermorph/ <~/results/CC-MAIN-2019-35/warc_lmhx/ks_0.cdb | |
1170 1564555978 | |
1122 ================ | 1171 ================ |
1123 | 1172 |
1124 | 1173 |
1125 Try it with the existing _per segment_ index we have for 2019-35 | 1174 Try it with the existing _per segment_ index we have for 2019-35 |
1126 | 1175 |
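
The inserted notes float "mmap->memoryview, maybe?" as a way to get values out, and check the reported offset and length (2488, 10) with dd. A minimal Python sketch of that idea follows, on assumptions: `read_slice` and `make_demo_file` are hypothetical names, and the demo file is a stand-in with the bytes `1564555978` planted at offset 2488, not a real cdb file. Nothing here parses the cdb format or reflects what the Cython `db` module actually does.

```python
import mmap
import os
import tempfile


def read_slice(path, offset, length):
    """Return `length` bytes starting at `offset`, read through an mmap.

    The memoryview slices are zero-copy windows onto the mapping; the only
    copy is the final bytes() call.
    """
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Both views are released before the mmap is closed, otherwise
            # closing the mapping would fail with exported pointers alive.
            with memoryview(mm) as view:
                with view[offset:offset + length] as window:
                    return bytes(window)


def make_demo_file(offset=2488, value=b"1564555978"):
    """Write a stand-in file (NOT a cdb) with `value` planted at `offset`."""
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * offset)   # padding up to the offset
        f.write(value)            # the value we expect to read back
        f.write(b"\0" * 16)       # some trailing bytes
    return path


if __name__ == "__main__":
    demo = make_demo_file()
    try:
        # Same check as: dd ibs=1 skip=2488 count=10 if=... of=/dev/stdout
        print(read_slice(demo, 2488, 10))
    finally:
        os.unlink(demo)
```

If the copy in `bytes(window)` turns out to matter for speed, the caller could instead keep the mapping open and work on the memoryview directly, at the cost of managing its lifetime by hand.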