99% of the time you don't need a database. Please stop wasting your time. Learn from my pain.

Because of popular opinion in the programming ecosystem, almost all developers believe that you need to use a database offering like MySQL or MongoDB. A smaller subset of informed individuals might opt for a simpler database solution like SQLite or UnQLite. This is a foolish belief system that I have fallen victim to myself. Let me explain why you most likely don't need a database.

You don't need the complexity in your small project

Many times in my past, I have been working on an internal tool or a small personal project when the need to store persistent data came up. Naturally, a SQL/NoSQL database is the first thing that comes to mind. Let me help you realize that this adds complexity to your project that you don't even think about. Regardless of the architecture, server-based databases of both types require a dedicated service to be running (typically a Windows service or a systemd service, depending on the operating system), and this comes with a cost. Furthermore, the database service can run into issues of its own, such as a database lock, and it's not obvious how your programming language's wrapper surfaces those issues or lets you deal with them. I failed a job interview because I did not know how to handle a database lock in MySQL when using Python's wrapper. (I literally footgunned myself and could have done something simpler, like what is mentioned below 😭).

Please stop paying for microservices

The next idea is offloading data to a microservice like AWS DynamoDB or MongoDB's cloud offering, but the downside is that it costs money. Every company I have worked for wants to save as much money as possible, and in some cases the bosses are too stingy to pay even $1 a month to store data in MongoDB's cloud. Using a cloud provider's service to ingest your data is a popular paradigm, and sadly it is the expected default in popular frameworks like Next.js (you are paying for a first-class developer experience at the cost of locking into microservices). Relying on microservices is a luxury, and sadly not many people can afford it (because their bosses are cheap).

The simplest solution is working with the programming language's version of JSON as persistent JSON

This might sound very bizarre, but hear me out. Look how simple this is to use:

>>> import shelve
>>> db = shelve.open('myshelf.db',writeback=True)
>>> db['servers']={'home' : ['10.20.0.100','10.20.0.101']}
>>> db['servers']['home']
['10.20.0.100', '10.20.0.101']
>>> db['servers']['home'].append('10.20.0.102')
>>> db['servers']['home']
['10.20.0.100', '10.20.0.101', '10.20.0.102']
>>> db['servers']['school'] = ['192.168.2.30']
>>> db['servers']
{'home': ['10.20.0.100', '10.20.0.101', '10.20.0.102'], 'school': ['192.168.2.30']}
>>> db.close()

The shelve module in Python is a convenient choice for small applications requiring a simple key-value store (the same architecture as NoSQL). It lets you persistently store Python objects (anything that can be pickled) in a dictionary-like format, so you save and retrieve data with familiar dictionary operations. The writeback=True flag in the session above is what lets in-place changes such as append() be saved when the shelf is closed. Since shelve ships with Python and works natively with its data types, it needs no complex setup or external dependencies. This simplicity makes shelve an excellent option for lightweight data storage needs. shelve on its own does not give you ACID guarantees; if you need them, the tiers later in this post name DictDataBase as the ACID-compliant step up.
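For a script rather than a REPL session, the same idea can be sketched without writeback=True. This is a minimal sketch, not the only way to do it: the shelf path is a hypothetical temporary location, and the read-modify-reassign pattern is used because, without writeback, changes made in place to a nested object are not saved.

```python
import os
import shelve
import tempfile

# Hypothetical location for the shelf; any writable path works.
path = os.path.join(tempfile.mkdtemp(), "myshelf")

# Without writeback=True, mutating a nested object in place is NOT persisted,
# so read the value, change it, and assign it back under its key.
with shelve.open(path) as db:
    db["servers"] = {"home": ["10.20.0.100", "10.20.0.101"]}
    servers = db["servers"]            # a copy unpickled from disk
    servers["home"].append("10.20.0.102")
    db["servers"] = servers            # reassigning the key saves the change

# Reopen the shelf to confirm the data survived the close.
with shelve.open(path) as db:
    saved_servers = db["servers"]
```

The with block closes the shelf even if an exception is raised, which is easy to forget with a manual db.close().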

The equivalent concept in JavaScript is the lowdb package:

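import { JSONFilePreset } from 'lowdb/node'

// Read db.json, or create it with this default data if it doesn't exist yet
const db = await JSONFilePreset('db.json', { posts: [] })
const post = { id: 1, title: 'lowdb is awesome', views: 100 }

// In two steps
db.data.posts.push(post)
await db.write()

// Or in one
await db.update(({ posts }) => posts.push(post))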

There are various reasons to use lowdb instead of a conventional database. lowdb is a flexible, lightweight JSON-based database that is perfect for tiny projects and experimentation. It is ideal for situations where simplicity and convenience matter most, since it removes the overhead of setting up and running a full database server. lowdb plugs easily into Node.js apps and lets you perform CRUD operations using plain JavaScript syntax.
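The same load, mutate, write pattern can be sketched in Python with nothing but the standard json module. JsonStore is a hypothetical helper written for this post, not part of lowdb or any library, and writing through a temporary file is a design choice of this sketch.

```python
import json
import os
import tempfile


class JsonStore:
    """Hypothetical helper: a dict loaded from and saved to one JSON file."""

    def __init__(self, path, default):
        self.path = path
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                self.data = json.load(f)
        else:
            self.data = default

    def write(self):
        # Write to a temp file first, then swap it in, so a crash mid-write
        # does not leave a half-written JSON file behind.
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(self.data, f, indent=2)
        os.replace(tmp_path, self.path)


# Hypothetical file location, used only to keep the example self-contained.
db_path = os.path.join(tempfile.mkdtemp(), "db.json")
db = JsonStore(db_path, {"posts": []})
db.data["posts"].append({"id": 1, "title": "lowdb is awesome", "views": 100})
db.write()

# A fresh instance reads back what was written.
reloaded = JsonStore(db_path, {"posts": []})
```

Unlike shelve, the file on disk is plain JSON, so you can open it in an editor or read it from another language.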

Key-Value architecture is ideal for rapid development projects

Using a NoSQL architecture for your small project can be highly advantageous if you don't have a fixed schema and anticipate frequent changes. NoSQL stores are schema-less, meaning they let you store data without defining a rigid structure upfront. This flexibility is particularly useful during the early stages of a project, when requirements are still evolving and you need to iterate quickly.

With a NoSQL architecture, you can add, remove, or modify fields without performing complex migrations or schema updates. This can save a lot of development time and reduce the risk of errors. Additionally, NoSQL stores often perform better for certain types of queries, such as simple lookups by key, and some of them can handle large volumes of data more efficiently, making them a robust choice for dynamic and rapidly changing applications.
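Here is a minimal sketch of what "no migration" looks like in practice, using plain Python dicts as the records. The posts, the tags field, and the tags_of helper are hypothetical and exist only for illustration.

```python
# Records written at different times; the newer one has an extra "tags" field.
posts = [
    {"id": 1, "title": "first post"},
    {"id": 2, "title": "second post", "tags": ["databases"]},
]


def tags_of(post):
    # .get() supplies a default for older records that predate the field,
    # so the old data never has to be rewritten.
    return post.get("tags", [])


all_tags = [tags_of(p) for p in posts]
```

The trade-off is that the schema now lives in your reading code: every place that touches a record has to cope with fields that may be missing.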

My new belief system when it comes to databases

You should adjust your database plan based on the number of users your application has. Python's shelve or JavaScript's lowdb is a fine place to start. When you get more users and need things like ACID, you can then do a near-seamless migration to a higher-grade NoSQL offering. Here's a breakdown of my belief for choosing databases based on the number of users:

1. Fewer than 100 Users: Persistent JSON

  • Technology Choices: Python's shelve or JavaScript's lowdb.
  • Reasoning: For small-scale applications, where user count is minimal, using a simple, lightweight solution like persistent JSON is ideal. These tools are easy to set up, require minimal configuration, and are well-suited for prototypes, small projects, or personal applications.
  • Benefits:
    • Quick setup and minimal overhead.
    • No need for complex database management.
    • Ideal for applications that don't require high concurrency or complex transactions.

2. Between 100 and 20,000 Users: ACID-Compliant Database

  • Technology Choice: DictDataBase.
  • Reasoning: As your user base grows, data integrity and consistency become crucial. Upgrading to an ACID-compliant database ensures that transactions are handled reliably, and data remains consistent even in the face of errors or crashes. DictDataBase offers a schema similar to your initial setup but with the added robustness of ACID properties.
  • Benefits:
    • Ensures data consistency and integrity.
    • Suitable for applications with moderate traffic and more complex data handling needs.
    • Provides a smooth transition from the simpler JSON-based system.

3. More than 20,000 Users: High-Availability and Distributed Database

  • Technology Choices: MongoDB or Riak.
  • Reasoning: At this scale, the application requires a database that can handle high traffic, ensure availability, and scale horizontally. MongoDB and Riak are both well-suited for distributed environments, offering high availability, fault tolerance, and seamless integration with your existing architecture.
  • Benefits:
    • Handles large volumes of data and user requests efficiently.
    • Supports horizontal scaling, allowing the system to grow as needed.
    • Maintains data availability even in distributed or multi-node setups.

This approach ensures that your database solution evolves alongside your application, maintaining efficiency, reliability, and scalability at each stage of growth.
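Moving from the first tier to the next can start with a plain JSON export. The sketch below assumes every value in the shelf is JSON-compatible (dicts, lists, strings, numbers); export_shelf_to_json and the file paths are hypothetical, and loading the exported file into the next database is not shown.

```python
import json
import os
import shelve
import tempfile

# Hypothetical paths, used only to keep the example self-contained.
workdir = tempfile.mkdtemp()
shelf_path = os.path.join(workdir, "myshelf")
export_path = os.path.join(workdir, "export.json")

# Seed a small shelf so there is something to export.
with shelve.open(shelf_path) as db:
    db["servers"] = {"home": ["10.20.0.100"], "school": ["192.168.2.30"]}


def export_shelf_to_json(shelf_path, export_path):
    """Hypothetical helper: copy every key/value in a shelf to a JSON file."""
    with shelve.open(shelf_path) as db:
        snapshot = {key: db[key] for key in db}
    with open(export_path, "w", encoding="utf-8") as f:
        json.dump(snapshot, f, indent=2)
    return snapshot


exported = export_shelf_to_json(shelf_path, export_path)
```

Because the data was already shaped as nested keys and values, the export is a straight copy with no restructuring.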

SQL is ideal with a clear vision

SQL databases are ideal when you have a clear schema and vision for your program from the start, offering strong data integrity and efficient querying. However, their rigidity can become a drawback as projects evolve, requiring changes to the schema. These changes can break compatibility with existing code, leading to complex migrations and maintenance challenges. While SQL excels in stable environments, its inflexibility makes it less suited for projects where the schema is likely to change frequently.
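To make the schema-change cost concrete, here is a minimal sketch using SQLite through Python's built-in sqlite3 module. SQLite is used purely because it needs no server; the servers table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
conn.execute("CREATE TABLE servers (id INTEGER PRIMARY KEY, ip TEXT NOT NULL)")
conn.execute("INSERT INTO servers (ip) VALUES (?)", ("10.20.0.100",))

# Requirements change: each server now needs a location. The schema has to be
# migrated before any code can write the new field.
conn.execute("ALTER TABLE servers ADD COLUMN location TEXT DEFAULT 'unknown'")
conn.execute(
    "INSERT INTO servers (ip, location) VALUES (?, ?)",
    ("192.168.2.30", "school"),
)

rows = conn.execute("SELECT ip, location FROM servers ORDER BY id").fetchall()
conn.close()
```

Adding one column is the easy case. Renaming or splitting columns, or changing types, usually means a scripted migration plus updates to every query that touches the table.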