Distributed Datastores and Secondary Indexes

Databases and Indexing

Secondary Indexes in Distributed Databases

  1. Local secondary indexes: Cassandra and MongoDB primarily support local secondary indexes, where each node indexes only the data it stores. A "global" query on the indexed field is then just a map/reduce-style scatter/gather: look up each node's local index and merge the results. This comes with restrictions and performance implications.
  2. Global indexes: TiDB (inspired by Google's Spanner paper) supports global indexes, but they are implemented as ordinary key-value pairs, stored no differently from the actual data. A lookup through the index therefore typically needs a second read to fetch the row, and every write must also update the index entries, so here too the index will not be especially performant.
  3. No secondary indexes: This is the simplest option. HBase, for example, doesn't support secondary indexes natively. Implementors are free to choose any secondary indexing strategy they want: some follow approach 2 (e.g. Apache Phoenix, which keeps its indexes in HBase tables), while others use external systems such as Elasticsearch.
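
To make the difference between approaches 1 and 2 concrete, here is a minimal toy sketch in Python. It is not how Cassandra, MongoDB, or TiDB are actually implemented; the class names, the `partition_for` helper, and the three-partition layout are all assumptions made for illustration, and it handles inserts only (no updates or deletes). It just shows which partitions a write and a read have to touch in each model.

```python
from collections import defaultdict

NUM_PARTITIONS = 3  # hypothetical cluster size


def partition_for(key):
    # Toy deterministic partitioner (assumption); real systems use
    # consistent hashing or key ranges.
    return sum(str(key).encode()) % NUM_PARTITIONS


class LocalIndexStore:
    """Approach 1: each partition indexes only the rows it holds."""

    def __init__(self):
        self.rows = [dict() for _ in range(NUM_PARTITIONS)]
        self.index = [defaultdict(set) for _ in range(NUM_PARTITIONS)]

    def put(self, pk, row, field):
        p = partition_for(pk)
        self.rows[p][pk] = row
        self.index[p][row[field]].add(pk)  # index lives next to the row
        return 1  # partitions touched by the write

    def find(self, value):
        # Scatter/gather: ask every partition's local index, then merge.
        hits = []
        for p in range(NUM_PARTITIONS):
            hits.extend(self.index[p].get(value, ()))
        return sorted(hits), NUM_PARTITIONS  # (keys, partitions touched)


class GlobalIndexStore:
    """Approach 2: index entries are plain key-value pairs,
    partitioned by the indexed value rather than by the primary key."""

    def __init__(self):
        self.rows = [dict() for _ in range(NUM_PARTITIONS)]
        self.index = [dict() for _ in range(NUM_PARTITIONS)]

    def put(self, pk, row, field):
        data_p = partition_for(pk)
        index_p = partition_for(row[field])
        self.rows[data_p][pk] = row
        # The index entry is itself a key-value pair: (value, pk) -> nothing.
        self.index[index_p][(row[field], pk)] = None
        return len({data_p, index_p})  # may span two partitions

    def find(self, value):
        # Only the partition that owns this index value is consulted.
        p = partition_for(value)
        hits = sorted(pk for (v, pk) in self.index[p] if v == value)
        return hits, 1  # (keys, partitions touched)
```

In this sketch a read through the local index always fans out to every partition, while a read through the global index touches one partition but still returns only primary keys, so fetching the rows is a further lookup. The global write, in turn, may have to update two different partitions, which in a real system generally means some form of cross-node coordination.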

A curious engineer
