comparison notes.txt @ 23:bfa38afaea63
change to default ns
author | Henry S. Thompson <ht@markup.co.uk> |
---|---|
date | Thu, 06 Apr 2017 16:47:53 +0100 |
parents | ca98c74a7cb1 |
children | 87e0d620deea |
comparison
22:ca98c74a7cb1 | 23:bfa38afaea63 |
---|---|
43 with (numerical) formats, but no content. Where do I throw those | 43 with (numerical) formats, but no content. Where do I throw those |
44 away? Can throw away empty _rows_ in rect.xsl, but for _cells_ have | 44 away? Can throw away empty _rows_ in rect.xsl, but for _cells_ have |
45 to wait for ascii.xsl or html.xsl. But only copy type in rect if | 45 to wait for ascii.xsl or html.xsl. But only copy type in rect if |
46 there was content before. | 46 there was content before. |
47 ----------- | 47 ----------- |
48 Using attributes to hold space-separated lists, as in | |
49 refs.xsl output, is risky! | |
50 ----------- | |
48 Not handling variables as references. Not catching external | 51 Not handling variables as references. Not catching external |
49 references to variables. Not catching naked [n]! as external | 52 references to variables. Not catching naked [n]! as external |
50 references. | 53 references. |
51 Fixed, but not dereferenced vars | 54 Fixed, but not dereferenced vars |
52 The definition table is in workbook.xml definedNames/definedName[@name=$name]/. | 55 The definition table is in workbook.xml definedNames/definedName[@name=$name]/. |
53 Sheet name to filename mapping for locals is in workbook.xml sheets/sheet[@name=$sname]/@sheetId | 56 Sheet name to filename mapping for locals is in workbook.xml sheets/sheet[@name=$sname]/@sheetId |
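A minimal sketch of the two lookups noted above, in Python rather than XSLT. The workbook fragment, the names `Rate`, `Prices` and `Summary`, and both helper functions are hypothetical, made up for illustration. It follows the note in returning @sheetId; in the general case the sheet's r:id relationship is what actually points at the worksheet part.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical, cut-down workbook.xml (not taken from any real file).
WORKBOOK = """<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheets>
    <sheet name="Prices" sheetId="1"/>
    <sheet name="Summary" sheetId="2"/>
  </sheets>
  <definedNames>
    <definedName name="Rate">Prices!$B$2</definedName>
  </definedNames>
</workbook>"""


def lookup_defined_name(root, name):
    """Mirror definedNames/definedName[@name=$name]: return its text or None."""
    for dn in root.findall("m:definedNames/m:definedName", NS):
        if dn.get("name") == name:
            return dn.text
    return None


def lookup_sheet_id(root, sname):
    """Mirror sheets/sheet[@name=$sname]/@sheetId: return the id or None."""
    for sh in root.findall("m:sheets/m:sheet", NS):
        if sh.get("name") == sname:
            return sh.get("sheetId")
    return None


root = ET.fromstring(WORKBOOK)
```

With the sample above, `lookup_defined_name(root, "Rate")` gives the definition text and `lookup_sheet_id(root, "Summary")` gives the sheet id as a string.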
54 | 57 ----------- |
58 Switch to default namespace in order to reduce size and improve readability | |
59 ----------- | |
60 Should put another step after refs.xsl to compute a map from | |
61 distinct-values of all targets to all the cells which use them | |
62 (likewise ranges). That really does mean we should move to elts for | |
63 each ref or range, since at this point we want to compute vector | |
64 representation as well, so we can identify projections | |
65 | |
66 Slightly irritating that we'll have to serialise this as XML and then | |
67 re-build it later... | |
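A rough sketch of that extra step, in Python rather than XSLT, assuming the refs.xsl output has already been flattened into (using cell, target) pairs. The pair list and both function names are hypothetical. The vector form here is just (column number, row number), which is only one possible reading of "vector representation".

```python
import re
from collections import defaultdict

# Hypothetical (using cell, target) pairs, as might be read off refs.xsl output.
pairs = [("C1", "A1"), ("C2", "A1"), ("C3", "B2"), ("C2", "B2"), ("C1", "A1")]


def targets_to_users(pairs):
    """Map each distinct target to the sorted list of cells that use it."""
    index = defaultdict(set)
    for cell, target in pairs:
        index[target].add(cell)
    return {target: sorted(cells) for target, cells in index.items()}


def to_vector(ref):
    """Turn an A1-style reference into (column, row), both 1-based."""
    m = re.fullmatch(r"([A-Z]+)([0-9]+)", ref)
    if m is None:
        raise ValueError(f"not a simple cell reference: {ref!r}")
    col = 0
    for ch in m.group(1):
        col = col * 26 + (ord(ch) - ord("A") + 1)
    return (col, int(m.group(2)))


users = targets_to_users(pairs)
```

Using sets means a cell that mentions the same target twice is only listed once, which matches the distinct-values idea.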
68 ----------- | |
69 Overgenerating in kenneth_lay__19506: e.g. <e:ref c="E9" er="[1]!'.SPX' '.SPX'!"/> | |
70 from <f>[1]!'.SPX'</f> | |
71 Hmm. This cell displays in Excel as REUTERS|IDN!.SPX | |
72 The indirections work as follows: | |
73 in workbook.xml: | |
74 <externalReferences> | |
75 <externalReference r:id="rId3"/> | |
76 <externalReference r:id="rId4"/> | |
77 </externalReferences> | |
78 in _rels/workbook.xml.rels | |
79 <Relationship Id="rId3" Target="externalLinks/externalLink1.xml" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/externalLink"/> | |
80 in externalLinks/externalLink1.xml | |
81 <ddeLink ddeService="REUTER" ddeTopic="IDN"... | |
82 <ddeItems> | |
83 ... | |
84 <ddeItem advise="1" name=".SPX"> | |
85 <values> | |
86 <value> | |
87 <val>1264.96</val> | |
88 </value> | |
89 </values> | |
90 </ddeItem> | |
91 Whew! | |
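The chain of indirections above can be sketched in Python as follows. The three parts are hypothetical, cut-down reconstructions of the fragments quoted above (only rId3 is included, and everything elided is left out), and `resolve_dde` is a made-up helper name. It assumes the [n] in a formula is a 1-based index into externalReferences, and that the quotes around the item name have already been stripped.

```python
import xml.etree.ElementTree as ET

MAIN = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG = "http://schemas.openxmlformats.org/package/2006/relationships"
NS = {"m": MAIN, "p": PKG}

# Hypothetical, cut-down parts keyed by their path relative to xl/.
parts = {
    "workbook.xml": f"""<workbook xmlns="{MAIN}" xmlns:r="{REL}">
  <externalReferences><externalReference r:id="rId3"/></externalReferences>
</workbook>""",
    "_rels/workbook.xml.rels": f"""<Relationships xmlns="{PKG}">
  <Relationship Id="rId3" Target="externalLinks/externalLink1.xml"
                Type="{REL}/externalLink"/>
</Relationships>""",
    "externalLinks/externalLink1.xml": f"""<externalLink xmlns="{MAIN}">
  <ddeLink ddeService="REUTER" ddeTopic="IDN">
    <ddeItems>
      <ddeItem advise="1" name=".SPX">
        <values><value><val>1264.96</val></value></values>
      </ddeItem>
    </ddeItems>
  </ddeLink>
</externalLink>""",
}


def resolve_dde(parts, n, item):
    """Follow [n]!item through workbook.xml, its rels, and the external link."""
    wb = ET.fromstring(parts["workbook.xml"])
    ext_refs = wb.findall("m:externalReferences/m:externalReference", NS)
    rid = ext_refs[n - 1].get(f"{{{REL}}}id")  # assumption: [n] is 1-based
    rels = ET.fromstring(parts["_rels/workbook.xml.rels"])
    target = None
    for rel in rels.findall("p:Relationship", NS):
        if rel.get("Id") == rid:
            target = rel.get("Target")
    if target is None:
        return None
    dde = ET.fromstring(parts[target]).find("m:ddeLink", NS)
    if dde is None:
        return None  # not a DDE link (e.g. an ordinary external workbook)
    for it in dde.findall("m:ddeItems/m:ddeItem", NS):
        if it.get("name") == item:
            val = it.find("m:values/m:value/m:val", NS)
            cached = val.text if val is not None else None
            return (dde.get("ddeService"), dde.get("ddeTopic"), cached)
    return None
```

For the formula [1]!'.SPX' this would be called as `resolve_dde(parts, 1, ".SPX")`, giving the service, topic and cached value.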
92 ---------- | |
93 Tried the largest sheet from the largest .xlsx I could find: | |
94 fuse1k/'benjamin_rogers__1002__NYISO Price Information version 2'.xlsx | |
95 -rw-r--r-- 1 ht None 6273325 Apr 3 16:22 '../benjamin_rogers__1002__NYISO Price Information version 2.xlsx' | |
96 -rw-r--r-- 1 ht None 23221149 Jan 1 1980 xl/worksheets/sheet3.xml | |
97 | |
98 > lxcount xl/worksheets/sheet3.xml | sort -k2nr | |
99 *Total* 1230217 | |
100 c 596032 | |
101 v 595876 | |
102 f 19201 | |
103 row 18985 | |
104 col 106 | |
105 | |
106 <dimension ref="A1:DY18985"/> | |
107 | |
108 Blew java out of the water :-( | |
109 java.lang.OutOfMemoryError: Java heap space | |
110 | |
111 Need to try again with more memory, if I remember how... | |
112 | |
113 The raw result is going to have 18985 x 102 ~= 2 million cells, i.e. | |
114 (assuming average cell size of 30 bytes and row overhead of 20) | |
115 (* 18985 (+ 20 (* 102 30))) == 58,473,800 bytes, which is big but tolerable... | |
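The same back-of-envelope estimate written out in Python, so the assumptions are easy to vary. The 30 bytes per cell and 20 bytes per row are the guesses from the note, not measurements, and the variable names are only for illustration.

```python
rows = 18985        # from <dimension ref="A1:DY18985"/>
cols = 102          # column count used in the note above
cell_bytes = 30     # assumed average serialised cell size
row_overhead = 20   # assumed per-row overhead

cells = rows * cols
total_bytes = rows * (row_overhead + cols * cell_bytes)

print(f"{cells:,} cells, {total_bytes:,} bytes")
```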
116 ---------------- | |
117 Back to ranges - | |
118 |