Choosing the Right Storage for Application Data

  1. For homogeneous data arrays, regular files do not provide possibility for convenient and fast search. You have to create, maintain and constantly update special indexing files. Modification of the data structure is almost impossible. Metainformation is limited. There is no built-in run-time compression or encryption of data.
  2. Relational databases are well suited for homogeneous data. They comprise a set of predefined records with rigid internal format. Main advantage of relational databases is an ability to locate data quickly according to specified criterion, as well as transactional support of data integrity. Their significant shortcoming is that relational databases will not work well for large-size data of variable length (BLOB fields are usually stored separately from the rest of the record). Moreover, keeping data in relational databases requires: a) use of specific DBMS, which limits severely portability of the data and of the application itself, b) pre-planning of database structure, including interrelational links and indexing policy, c) researching details of peak loads is required for efficient database development, which also may be a serious overhead.
  3. Structured storages are somewhat analogous to a file system, i.e. storages are a specific set of enveloped named streams (files). Such storage can be stored at any location, i.e. in a single file on a disk, in a database record, or even in RAM. The main advantage of this approach is that it allows efficient adding or deleting data in an existing storage, provides the effective manipulation of data of various sizes (from small to huge). The storages represent separate units (files) and therefore can be easily relocated, copied, duplicated, backed up. There is no need to track all files generated by an application. Moreover, journal keeping makes it possible to restore content completely or partially, thus eliminating accidents or failures. The disadvantage may be relatively slower search inside these huge data arrays.
  4. ZIP archives, as a specific form of the structured storage, can be used for storing homogenous data arrays, but only in case when the most of access is read-only. Standardized nature of ZIP format makes it easy to use, especially in cross-platform applications, but this format is not suitable for the data to be modified after packing, so adding and deleting of data is a time-consuming operation.
  5. Remote and distributed storages are the next level of storage in which actual data location and data access are provided by specific layer used for encapsulating of access mechanics. In such storages data can actually be stored in databases or be distributed among different file systems, but the actual storage organization does not matter for an end-user. The user observes only a set of objects accessed through an API, or, as a variant, through file system calls. Good example is cloud storages. These types of data storages are to be used in large software complexes. Among other advantages one can mention unified data access without a need to think about actual ways how data are stored. Its disadvantages — they cannot be efficiently managed and controlled, and backup or migration of data is complicated.

Leave a Reply

Your email address will not be published. Required fields are marked *