Whether or not you know, or care, most database software has hard-coded limits on the number of columns you can store in a single table.
Often, if you hit this limit for some reason, it indicates that ‘you are doing it wrong.’ However, this is not always the case.
For instance, say you want to store some information that is a function of isotope. Well, there exist some 3,000+ nuclides that are known to exist in nature. If you want to store this data in a single table and your database allows 4,096 columns per table, you are in luck! However, if you go down by a single power of two, to 2,048 columns per table, you are in trouble.
Below is a list of different storage systems and their column limits per table:
- MySQL 5: 4,096
- MSSQL: 1,024
- Oracle: 1,000
- Excel 2003: 256
- Excel 2007: 16,384
- HDF5: ~1,260
- NumPy: +inf
That’s right; NumPy structured (or record) arrays seem to allow as many columns as you can fit into memory! Below is some code that demonstrates this. I tested it up to a million columns, but I am certain that it could handle a few more orders of magnitude.
This situation is peculiar, since traditional databases are bounded by disk size while NumPy is bounded by memory, and HDDs are, of course, much larger than RAM.
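To see what a structured array "column" actually is, here is a minimal sketch (the field names and values are made up for illustration): each column is just a named field of the dtype, accessed by name rather than by a separate table schema.

```python
import numpy as np

# Hypothetical example: a tiny "table" with three named columns.
dt = np.dtype([('id', np.int64), ('mass', np.float64), ('symbol', 'S8')])
table = np.zeros(3, dtype=dt)

# Columns are accessed by field name, rows by integer index.
table['mass'] = [1.008, 4.003, 6.94]
table['symbol'] = [b'H', b'He', b'Li']

print(table['mass'][1])   # a single cell
print(table.dtype.names)  # the "column" names
```

Since the whole array lives in memory, the only cost of more columns is a wider itemsize per row, which is why the field count can grow so large.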
Go to the Source
```python
import numpy as np
import uuid
from random import sample

# Candidate column types: integers, floats, and (zero-length) byte strings.
population = [int, float, 'S']

def rand_dtype(n):
    """Build a structured dtype with n randomly typed, uniquely named fields."""
    dt = []
    for i in range(n):
        # Each field gets a UUID for a name and one randomly sampled type.
        dt.extend(
            [(str(uuid.uuid1()), np.dtype(d)) for d in sample(population, 1)]
        )
    dt = np.dtype(dt)
    return dt

if __name__ == "__main__":
    dt = rand_dtype(1000000)
    z = np.zeros(10, dtype=dt)
```