btree.rst 5.6 KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159
  1. :mod:`btree` -- simple BTree database
  2. =====================================
  3. .. module:: btree
  4. :synopsis: simple BTree database
  5. The ``btree`` module implements a simple key-value database using external
  6. storage (disk files, or in general case, a random-access `stream`). Keys are
  7. stored sorted in the database, and besides efficient retrieval by a key
  8. value, a database also supports efficient ordered range scans (retrieval
  9. of values with the keys in a given range). On the application interface
  10. side, BTree database work as close a possible to a way standard `dict`
  11. type works, one notable difference is that both keys and values must
  12. be `bytes` objects (so, if you want to store objects of other types, you
  13. need to serialize them to `bytes` first).
  14. The module is based on the well-known BerkelyDB library, version 1.xx.
  15. Example::
  16. import btree
  17. # First, we need to open a stream which holds a database
  18. # This is usually a file, but can be in-memory database
  19. # using uio.BytesIO, a raw flash partition, etc.
  20. # Oftentimes, you want to create a database file if it doesn't
  21. # exist and open if it exists. Idiom below takes care of this.
  22. # DO NOT open database with "a+b" access mode.
  23. try:
  24. f = open("mydb", "r+b")
  25. except OSError:
  26. f = open("mydb", "w+b")
  27. # Now open a database itself
  28. db = btree.open(f)
  29. # The keys you add will be sorted internally in the database
  30. db[b"3"] = b"three"
  31. db[b"1"] = b"one"
  32. db[b"2"] = b"two"
  33. # Assume that any changes are cached in memory unless
  34. # explicitly flushed (or database closed). Flush database
  35. # at the end of each "transaction".
  36. db.flush()
  37. # Prints b'two'
  38. print(db[b"2"])
  39. # Iterate over sorted keys in the database, starting from b"2"
  40. # until the end of the database, returning only values.
  41. # Mind that arguments passed to values() method are *key* values.
  42. # Prints:
  43. # b'two'
  44. # b'three'
  45. for word in db.values(b"2"):
  46. print(word)
  47. del db[b"2"]
  48. # No longer true, prints False
  49. print(b"2" in db)
  50. # Prints:
  51. # b"1"
  52. # b"3"
  53. for key in db:
  54. print(key)
  55. db.close()
  56. # Don't forget to close the underlying stream!
  57. f.close()
  58. Functions
  59. ---------
  60. .. function:: open(stream, \*, flags=0, pagesize=0, cachesize=0, minkeypage=0)
  61. Open a database from a random-access `stream` (like an open file). All
  62. other parameters are optional and keyword-only, and allow to tweak advanced
  63. parameters of the database operation (most users will not need them):
  64. * *flags* - Currently unused.
  65. * *pagesize* - Page size used for the nodes in BTree. Acceptable range
  66. is 512-65536. If 0, a port-specific default will be used, optimized for
  67. port's memory usage and/or performance.
  68. * *cachesize* - Suggested memory cache size in bytes. For a
  69. board with enough memory using larger values may improve performance.
  70. Cache policy is as follows: entire cache is not allocated at once;
  71. instead, accessing a new page in database will allocate a memory buffer
  72. for it, until value specified by *cachesize* is reached. Then, these
  73. buffers will be managed using LRU (least recently used) policy. More
  74. buffers may still be allocated if needed (e.g., if a database contains
  75. big keys and/or values). Allocated cache buffers aren't reclaimed.
  76. * *minkeypage* - Minimum number of keys to store per page. Default value
  77. of 0 equivalent to 2.
  78. Returns a BTree object, which implements a dictionary protocol (set
  79. of methods), and some additional methods described below.
  80. Methods
  81. -------
  82. .. method:: btree.close()
  83. Close the database. It's mandatory to close the database at the end of
  84. processing, as some unwritten data may be still in the cache. Note that
  85. this does not close underlying stream with which the database was opened,
  86. it should be closed separately (which is also mandatory to make sure that
  87. data flushed from buffer to the underlying storage).
  88. .. method:: btree.flush()
  89. Flush any data in cache to the underlying stream.
  90. .. method:: btree.__getitem__(key)
  91. btree.get(key, default=None)
  92. btree.__setitem__(key, val)
  93. btree.__detitem__(key)
  94. btree.__contains__(key)
  95. Standard dictionary methods.
  96. .. method:: btree.__iter__()
  97. A BTree object can be iterated over directly (similar to a dictionary)
  98. to get access to all keys in order.
  99. .. method:: btree.keys([start_key, [end_key, [flags]]])
  100. btree.values([start_key, [end_key, [flags]]])
  101. btree.items([start_key, [end_key, [flags]]])
  102. These methods are similar to standard dictionary methods, but also can
  103. take optional parameters to iterate over a key sub-range, instead of
  104. the entire database. Note that for all 3 methods, *start_key* and
  105. *end_key* arguments represent key values. For example, `values()`
  106. method will iterate over values corresponding to they key range
  107. given. None values for *start_key* means "from the first key", no
  108. *end_key* or its value of None means "until the end of database".
  109. By default, range is inclusive of *start_key* and exclusive of
  110. *end_key*, you can include *end_key* in iteration by passing *flags*
  111. of `btree.INCL`. You can iterate in descending key direction
  112. by passing *flags* of `btree.DESC`. The flags values can be ORed
  113. together.
  114. Constants
  115. ---------
  116. .. data:: INCL
  117. A flag for `keys()`, `values()`, `items()` methods to specify that
  118. scanning should be inclusive of the end key.
  119. .. data:: DESC
  120. A flag for `keys()`, `values()`, `items()` methods to specify that
  121. scanning should be in descending direction of keys.