(This is the 51st of many promised articles which explain an idea in isolation. It is hoped that ideas may be adapted, linked together and implemented.)
I've described an outline of my ideal filing system. My ideal database allows generalized use of the full-text search facility. Describing it requires an outline of the proposed volume storage, which borrows widely from ReiserFS, ZFS and MySQL Server storage engines such as InnoDB, Nitro and NDB Cluster.
Media is divided into 512MB stripes and each unit has a map of stripe allocation types, where one type is no allocation and another type is bad stripe. It is envisioned that six out of every 67 stripes perform parity functions and that this role is rotated in a procession across units. Each stripe has a bad sector map. For 1KB sectors, this map requires 64KB (one bit for each of 524,288 sectors). For 4KB sectors, it requires 16KB. If the sectors which hold the bad sector map are themselves bad, the stripe cannot be used. If the stripe map is bad, the unit cannot be used.
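The map sizes quoted above can be checked with a few lines of Python. This is only a sketch: it assumes one bit per sector, which is consistent with the 64KB and 16KB figures, and the function names are my own invention.

```python
KIB = 1024
MIB = 1024 * KIB
STRIPE_BYTES = 512 * MIB  # stripe size taken from the text


def bad_sector_map_bytes(sector_bytes, stripe_bytes=STRIPE_BYTES):
    """Bytes needed for a bad sector map, assuming one bit per sector."""
    sectors = stripe_bytes // sector_bytes
    return (sectors + 7) // 8  # round up to whole bytes


def bad_sector_map_sectors(sector_bytes, stripe_bytes=STRIPE_BYTES):
    """Sectors occupied by the map itself (rounded up)."""
    map_bytes = bad_sector_map_bytes(sector_bytes, stripe_bytes)
    return (map_bytes + sector_bytes - 1) // sector_bytes


if __name__ == "__main__":
    for sector_bytes in (1 * KIB, 4 * KIB):
        print(
            sector_bytes,
            bad_sector_map_bytes(sector_bytes) // KIB,
            bad_sector_map_sectors(sector_bytes),
        )
```

Under this assumption, the map occupies 64 sectors of a stripe with 1KB sectors and four sectors of a stripe with 4KB sectors.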
The remaining sectors within a stripe are available to an application. However, applications may not be expecting raw, disjoint storage and therefore a standard contiguous mapping with redundancy may be utilized.
The full-text search for the filing system utilizes a specialized database in which search terms are striped by length and carry attributes such as capitalization and accents. Conceptually, "House++" would be stored as HOUSE-15-10000-00000 where the digits represent punctuation, capitalization and accented characters. Sequential entries would be compressed into fragments occupying 13 or 21 sequential sectors and would be split or reconciled as required.
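To make the conceptual key concrete, here is a rough sketch in Python. It derives the capitalization and accent digits from the term. The punctuation code (15 in the example) is passed in as an opaque value because its encoding is not specified here. The function name term_key and the use of Unicode decomposition to detect accents are assumptions for illustration only.

```python
import unicodedata


def term_key(term, punctuation_code):
    """Build a key such as HOUSE-15-10000-00000 (hypothetical layout).

    punctuation_code is supplied by the caller; how it is derived
    from the punctuation is not defined in this sketch.
    """
    letters, caps, accents = [], [], []
    for ch in term:
        # Split an accented letter into its base letter plus combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        base = decomposed[0]
        if not base.isalpha():
            continue  # punctuation is covered by punctuation_code
        letters.append(base.upper())
        caps.append("1" if base.isupper() else "0")
        has_accent = any(unicodedata.combining(c) for c in decomposed[1:])
        accents.append("1" if has_accent else "0")
    return "-".join(
        ["".join(letters), str(punctuation_code), "".join(caps), "".join(accents)]
    )


if __name__ == "__main__":
    print(term_key("House++", 15))  # HOUSE-15-10000-00000
    print(term_key("caf\u00e9", 0))  # CAFE-0-0000-0001
```

Separating the folded letters from the attribute digits means that every variant of a word sorts next to its siblings, which is presumably what makes sequential entries compress well.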
The general database storage requires three or more 512MB stripes per table. One or more stripes hold 13-sector fragments. One or more stripes hold 21-sector fragments. One or more stripes hold the index of fragments. All rows within a table are stored in N-dimensional Peano curve format and therefore the table is its own universal index. Sets of eight rows are bit transposed, Peano mixed and arithmetic compressed into one stream. If a compressed fragment exceeds 13 sectors, it is placed into a 21-sector fragment. If it exceeds 21 sectors, it is split across two 13-sector fragments. All CHAR and VARCHAR fields which are longer than 13 bytes are stored in shadow tables which require their own 512MB stripes. Each definition of VARCHAR(255) requires several gigabytes of space. BLOB fields are stored in an unindexed Fibonacci filing system.
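Two of the steps above are simple enough to sketch in Python: the bit transposition of eight rows and the choice of fragment size. This is not a definitive implementation. It assumes eight fixed-width rows of equal length, the names transpose_rows and place_fragment are hypothetical, and the Peano mixing and arithmetic compression stages are omitted entirely. The behaviour for data which exceeds two 13-sector fragments is not defined in the text, so the sketch simply raises an error.

```python
SMALL_FRAGMENT = 13  # sectors
LARGE_FRAGMENT = 21  # sectors


def transpose_rows(rows):
    """Bit-transpose eight equal-length rows.

    For each byte position, the 8x8 bit matrix (one byte per row) is
    transposed, so each output byte holds the same bit from all eight rows.
    """
    if len(rows) != 8 or len({len(row) for row in rows}) != 1:
        raise ValueError("expected eight rows of equal length")
    out = bytearray()
    for column in zip(*rows):  # one byte from each of the eight rows
        for bit in range(7, -1, -1):  # most significant bit first
            byte = 0
            for value in column:
                byte = (byte << 1) | ((value >> bit) & 1)
            out.append(byte)
    return bytes(out)


def place_fragment(sectors_needed):
    """Return the fragment sizes (in sectors) used for compressed data."""
    if sectors_needed <= SMALL_FRAGMENT:
        return [SMALL_FRAGMENT]
    if sectors_needed <= LARGE_FRAGMENT:
        return [LARGE_FRAGMENT]
    if sectors_needed <= 2 * SMALL_FRAGMENT:
        return [SMALL_FRAGMENT, SMALL_FRAGMENT]
    raise ValueError("larger than two 13-sector fragments: not covered here")


if __name__ == "__main__":
    rows = [bytes([1 << i]) for i in range(8)]
    print(transpose_rows(rows).hex())
    for sectors in (10, 13, 14, 21, 22, 26):
        print(sectors, place_fragment(sectors))
```

Transposition groups the same bit of every row together, so columns which rarely change produce long runs of identical bits for the compressor to exploit.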
If you doubt the wisdom of chopping data so finely then please investigate the sort utility used in a previous project.