1. What is NoSQL?
NoSQL stands for "Not only SQL." It's a category of non-relational databases
designed to handle large volumes of unstructured or semi-structured data. Unlike
traditional relational (SQL) databases, NoSQL databases typically do not require
a fixed schema, which makes them flexible, and most are built to scale
horizontally across many servers.
2. What are the key differences between SQL and NoSQL databases?
| Feature | SQL (Relational) | NoSQL (Non-Relational) |
|---|---|---|
| Schema | Fixed, predefined schema. | Dynamic schema or schema-less. |
| Scalability | Primarily vertical (scaling up). | Primarily horizontal (scaling out). |
| Data Model | Tabular with rows and columns. | Document, key-value, wide-column, or graph. |
| ACID vs. BASE | Follows ACID properties for transactions. | Follows BASE properties for availability and scalability. |
| Joins | Supports complex joins. | Generally does not support joins; denormalization is preferred. |
| Use Case | Ideal for structured, transactional data. | Ideal for large, unstructured data, big data, and real-time web applications. |
3. Explain the CAP Theorem.
The CAP Theorem (or Brewer's Theorem) states that a distributed data store
can only provide two of three guarantees simultaneously:
- Consistency: Every read receives the most recent write or an error.
- Availability: Every request receives a non-error response, but without the
  guarantee that it's the most recent data.
- Partition Tolerance: The system continues to operate despite an
  arbitrary number of messages being dropped or delayed by the network
  between nodes.
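Because network partitions cannot be ruled out in a distributed system, the
practical choice is between consistency and availability while a partition lasts.
Many NoSQL databases (e.g., Cassandra) are "AP" (Available and Partition
Tolerant), prioritizing scalability and availability over strong consistency,
while others (e.g., HBase, MongoDB) lean "CP" and favor consistency.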
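As a rough illustration of that trade-off, the toy sketch below models two replicas and a partition between them. The `TinyStore` class, its `mode` flag, and `PartitionError` are hypothetical names invented for this example; real databases make this choice with far more machinery (quorums, leader election, and so on).

```python
class PartitionError(Exception):
    """Raised when a CP store refuses a request during a partition."""


class TinyStore:
    """Toy two-replica store; `mode` is "CP" or "AP" (hypothetical names)."""

    def __init__(self, mode):
        self.mode = mode
        self.replicas = [{}, {}]   # each replica holds its own copy of the data
        self.partitioned = False   # True when the replicas cannot reach each other

    def write(self, replica_id, key, value):
        if not self.partitioned:
            # Healthy network: apply the write to every replica.
            for replica in self.replicas:
                replica[key] = value
        elif self.mode == "CP":
            # Consistency first: refuse the write rather than let replicas diverge.
            raise PartitionError("cannot reach the other replica")
        else:
            # Availability first: accept the write locally; replicas now differ.
            self.replicas[replica_id][key] = value

    def read(self, replica_id, key):
        if self.partitioned and self.mode == "CP":
            raise PartitionError("cannot confirm the latest value")
        return self.replicas[replica_id].get(key)


cp = TinyStore("CP")
ap = TinyStore("AP")
for store in (cp, ap):
    store.write(0, "x", 1)
    store.partitioned = True

ap.write(0, "x", 2)       # accepted, but only replica 0 sees it
stale = ap.read(1, "x")   # replica 1 still answers, with the old value
```

During the partition the AP store keeps answering but may return stale data, while the CP store returns an error instead.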
4. What is Eventual Consistency?
Eventual consistency is a consistency model used in distributed NoSQL
databases. It guarantees that if no new updates are made to a given data item,
all accesses to that item will eventually return the last updated value. This model
prioritizes availability and partition tolerance over immediate consistency.
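A minimal sketch of the idea, assuming a last-write-wins rule and a single background `sync` step; the `Replica` and `Cluster` classes are hypothetical and not taken from any particular database:

```python
class Replica:
    """Toy replica that stores a (version, value) pair per key."""

    def __init__(self):
        self.data = {}

    def apply(self, key, version, value):
        # Last-write-wins: keep the update only if it is newer than what we hold.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


class Cluster:
    def __init__(self, size):
        self.replicas = [Replica() for _ in range(size)]
        self.pending = []   # updates not yet delivered to every replica
        self.version = 0

    def write(self, key, value):
        # The write is acknowledged after reaching one replica only.
        self.version += 1
        self.replicas[0].apply(key, self.version, value)
        self.pending.append((key, self.version, value))

    def sync(self):
        # Background replication: deliver queued updates to the other replicas.
        for key, version, value in self.pending:
            for replica in self.replicas[1:]:
                replica.apply(key, version, value)
        self.pending.clear()


cluster = Cluster(3)
cluster.write("profile:42", "v1")
before = [r.read("profile:42") for r in cluster.replicas]
cluster.sync()
after = [r.read("profile:42") for r in cluster.replicas]
```

Before `sync` runs, only the first replica returns the new value; once updates stop and replication catches up, every replica returns the same value.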
5. What are the main types of NoSQL databases?
The four main types are:
- Key-Value Stores: Data is stored as a collection of key-value pairs.
  Simple and fast. (e.g., Redis, DynamoDB).
- Document Databases: Store data in flexible, semi-structured documents
  (like JSON or BSON). Ideal for content management and catalogs. (e.g.,
  MongoDB, Couchbase).
- Wide-Column Stores: Store data in rows identified by a row key, with
  columns grouped into column families that can vary from row to row. Well
  suited to very large, write-heavy datasets and large-scale analytics.
  (e.g., Cassandra, HBase).
- Graph Databases: Store data in nodes and edges to represent
  relationships. Well suited to social networks, recommendation engines, and
  fraud detection. (e.g., Neo4j, Amazon Neptune).
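To make the four models concrete, the sketch below shows roughly how the same user might be shaped in each one, using plain Python structures. The field names and sample values are made up for illustration; real products each have their own storage formats and APIs.

```python
import json

# Key-value: an opaque value looked up by a unique key.
key_value = {"user:42": json.dumps({"name": "Ada", "city": "London"})}

# Document: a nested, self-describing record.
document = {
    "_id": 42,
    "name": "Ada",
    "address": {"city": "London"},
    "follows": [7, 13],
}

# Wide-column: row key -> column family -> column -> value.
wide_column = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Graph: nodes plus labeled edges between them.
graph = {
    "nodes": {42: {"name": "Ada"}, 7: {"name": "Grace"}},
    "edges": [(42, "FOLLOWS", 7)],
}
```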
6. When would you use a NoSQL database over a relational one?
Choose NoSQL when:
- You are dealing with large volumes of unstructured or semi-structured
  data.
- You need high scalability and high availability for web-scale
  applications.
- Your application has a rapidly evolving schema that is not fixed.
- You need fast read/write operations and are willing to sacrifice strong
  consistency.
7. What is BASE?
BASE is a set of properties that NoSQL databases often follow, in contrast to the
ACID properties of relational databases. It stands for:
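- Basically Available: The system stays available for reads and writes most
  of the time, even during partial failures, although a response may be
  stale or degraded.
- Soft state: The state of the system can change over time even without
  new input, because replicas are still converging.
- Eventually consistent: Once updates stop, all replicas converge to the
  same value over time.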
8. Explain data modeling in a NoSQL database.
NoSQL data modeling is fundamentally different from relational modeling. It's
often application-centric or query-driven. Instead of normalizing data to
avoid redundancy, NoSQL databases often denormalize data by embedding
related information within a single document or record. This minimizes the need
for joins and optimizes read performance.
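A small sketch of the contrast, using in-memory Python dictionaries as stand-ins for tables and documents (the order data and function names are hypothetical):

```python
# Relational style: normalized data spread across "tables".
customers = {1: {"name": "Ada"}}
orders = {100: {"customer_id": 1}}
order_items = [
    {"order_id": 100, "sku": "book", "qty": 2},
    {"order_id": 100, "sku": "pen", "qty": 1},
]


def load_order_normalized(order_id):
    """Assemble an order by joining three tables in application code."""
    order = orders[order_id]
    return {
        "order_id": order_id,
        "customer": customers[order["customer_id"]]["name"],
        "items": [
            {"sku": item["sku"], "qty": item["qty"]}
            for item in order_items
            if item["order_id"] == order_id
        ],
    }


# Document style: the same order denormalized into one record,
# shaped for the "show order" query.
order_documents = {
    100: {
        "order_id": 100,
        "customer": "Ada",  # copied from the customer record (redundant on purpose)
        "items": [{"sku": "book", "qty": 2}, {"sku": "pen", "qty": 1}],
    }
}


def load_order_document(order_id):
    """A single lookup returns everything the query needs."""
    return order_documents[order_id]
```

Both functions return the same result, but the document version needs one lookup. The cost is that a customer rename must now be propagated to every order that embeds the name.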
9. What is Sharding?
Sharding is a method for distributing data across multiple machines or servers
(shards) to achieve horizontal scalability. Each shard holds a portion of the entire
dataset. This allows a database to handle much larger datasets and higher traffic
loads than a single server could.
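One common way to decide which shard holds a record is to hash a shard key. The sketch below assumes simple modulo hashing; `shard_for` and `ShardedStore` are hypothetical names, and each "shard" is just a dictionary standing in for a server.

```python
import hashlib


def shard_for(key, num_shards):
    """Map a shard key to a shard index using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    def __init__(self, num_shards):
        self.shards = [{} for _ in range(num_shards)]  # one dict per "server"

    def put(self, key, value):
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key):
        # Only the owning shard is consulted, never the whole cluster.
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(4)
for user_id in range(1000):
    store.put(f"user:{user_id}", {"id": user_id})

sizes = [len(shard) for shard in store.shards]  # keys held by each shard
```

Modulo hashing is the simplest scheme, but changing the number of shards remaps most keys. Production systems typically use consistent hashing or range-based partitioning to limit that data movement.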
10. How do you handle relationships in a NoSQL database?
Since NoSQL databases typically don't support traditional joins, relationships are
handled differently:
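- Embedding: Storing related data inside the parent document. Best for
  one-to-one or one-to-few relationships. This is the fastest method for
  retrieval because a single read returns everything.
- Referencing: Storing the ID of another document, similar to a foreign key
  in a relational database. The application (or a lookup feature of the
  database) resolves the reference with a second query. Used for
  many-to-many relationships and for one-to-many relationships where the
  "many" side is large or unbounded.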
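The two approaches can be sketched with plain dictionaries. The students and courses data below is invented for illustration, and `courses_for` / `students_in` are hypothetical helpers showing how an application resolves references itself.

```python
# Embedding: a one-to-few relationship lives inside the parent document.
user = {
    "_id": "u1",
    "name": "Ada",
    "addresses": [{"city": "London"}, {"city": "Paris"}],
}

# Referencing: a many-to-many relationship stores only IDs.
students = {
    "s1": {"name": "Ada", "course_ids": ["c1", "c2"]},
    "s2": {"name": "Grace", "course_ids": ["c1"]},
}
courses = {
    "c1": {"title": "Databases"},
    "c2": {"title": "Networks"},
}


def courses_for(student_id):
    """Resolve references with a second lookup (an application-level join)."""
    return [courses[cid]["title"] for cid in students[student_id]["course_ids"]]


def students_in(course_id):
    """Reverse direction: scan for students that reference the course."""
    return sorted(
        s["name"] for s in students.values() if course_id in s["course_ids"]
    )
```

Reading the embedded addresses needs no extra lookup, whereas each reference costs an additional fetch. In exchange, a course title is stored once and can be changed in one place.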
11. What is an aggregate?
In NoSQL, an aggregate is a collection of related data items that are treated as a
single unit. For example, in a document database, a document is an aggregate.
This concept is central to data modeling, as operations should ideally be
contained within a single aggregate to maintain consistency without the need for
complex transactions.
12. What is a "collection" in MongoDB, and how does it compare to a
"table" in SQL?
A collection in MongoDB is a group of documents. It's similar to a table in a
relational database, but it's schema-less, meaning documents within the same
collection can have different fields and data types.
13. How does indexing work in NoSQL databases?
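Indexing in NoSQL is similar to SQL: the database maintains auxiliary data
structures (such as B-trees) so that queries can find matching records without
scanning the whole collection. Because the schema is flexible, an indexed field
may be missing from some documents; depending on the database and index type,
such documents are either indexed under a null value or left out of the index
(a sparse index). Compound indexes on multiple fields and text indexes for
full-text search are also common.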
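The idea behind single-field and compound indexes can be sketched with a dictionary that maps field values to document IDs. This is a simplified, hash-style stand-in (real databases usually use ordered structures that also support range queries), and `build_index` is a hypothetical helper.

```python
from collections import defaultdict

collection = {
    1: {"name": "Ada", "city": "London", "age": 36},
    2: {"name": "Grace", "city": "New York"},  # no "age" field
    3: {"name": "Alan", "city": "London", "age": 41},
}


def build_index(docs, *fields):
    """Map each tuple of field values to the IDs of matching documents.

    Documents missing any indexed field are skipped, which mimics a
    sparse index.
    """
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        if all(field in doc for field in fields):
            index[tuple(doc[field] for field in fields)].add(doc_id)
    return index


city_index = build_index(collection, "city")             # single-field index
city_age_index = build_index(collection, "city", "age")  # compound index

# Lookups consult the index instead of scanning every document.
londoners = city_index[("London",)]
match = city_age_index[("London", 41)]
```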
14. What are the advantages of using a document database like
MongoDB?
- Flexible Schema: Allows for rapid and agile development.
- Rich Queries: Supports a wide range of queries, including queries on
  embedded documents and arrays.
- Horizontal Scalability: Easily scales out by adding more servers.
- Developer-Friendly: Uses JSON/BSON format, which is familiar to
  developers.
15. What are the trade-offs of using a NoSQL database?
- Lack of standardized query language: Each NoSQL database has its
  own query language or API.
- Eventual consistency: May not be suitable for applications requiring
  strict data consistency (e.g., banking).
- Less mature ecosystem: Can have fewer tools and community support
  compared to SQL.
- No schema enforcement: Can lead to data inconsistencies and
  application-level complexity.
16. What is a replica set in MongoDB?
A replica set is a group of MongoDB processes that maintain the same data set.
It provides high availability and data redundancy. If the primary node fails, an
election is held, and one of the secondary nodes is promoted to primary.
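The failover step can be pictured with the toy sketch below. It is a deliberate simplification and not MongoDB's actual election protocol: it assumes every node votes, that a majority of all nodes must be reachable, and that the most up-to-date reachable node wins. `Node`, `last_applied`, and `elect_primary` are hypothetical names.

```python
class Node:
    def __init__(self, name, last_applied):
        self.name = name
        self.last_applied = last_applied  # index of the last replicated operation
        self.alive = True


def elect_primary(nodes):
    """Pick a new primary among reachable nodes, or None without a majority."""
    reachable = [node for node in nodes if node.alive]
    if len(reachable) <= len(nodes) // 2:
        return None  # no majority: no primary, so no writes are accepted
    # Prefer the most up-to-date node; break ties by name for determinism.
    return max(reachable, key=lambda node: (node.last_applied, node.name))


a, b, c = Node("a", 10), Node("b", 9), Node("c", 10)
replica_set = [a, b, c]

a.alive = False                      # the primary fails
new_primary = elect_primary(replica_set)

c.alive = False                      # a second failure leaves one node of three
no_primary = elect_primary(replica_set)
```

With one node down, the two survivors form a majority and the more up-to-date one is promoted. With two nodes down, no majority exists and no primary is chosen.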
17. How is data stored in a Key-Value database?
Data is stored as a unique key pointing to a value. The value can be a simple
data type (e.g., string, number) or a complex object. It is the simplest NoSQL
data model, providing extremely fast read and write access for a given key.
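A minimal in-memory sketch of the model follows. The `KeyValueStore` class is hypothetical; it serializes values to bytes to emphasize that the store treats them as opaque and can only look them up by key.

```python
import json


class KeyValueStore:
    """Toy in-memory store: values are opaque bytes addressed by key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # Serialize so the store never needs to understand the value's structure.
        self._data[key] = json.dumps(value).encode("utf-8")

    def get(self, key, default=None):
        raw = self._data.get(key)
        return default if raw is None else json.loads(raw.decode("utf-8"))

    def delete(self, key):
        # Return True if the key existed.
        return self._data.pop(key, None) is not None


kv = KeyValueStore()
kv.put("session:abc", {"user_id": 42, "cart": ["book", "pen"]})
kv.put("counter:visits", 1024)
```

Every operation takes a key; there is no way to ask "which sessions contain a pen?" without reading all values, which is the trade-off for the model's simplicity and speed.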
18. What are some use cases for a Graph Database?
Graph databases are used for applications where relationships are the most
important part of the data. Examples include:
- Social networks: Finding friends of friends.
- Fraud detection: Identifying complex relationships between fraudulent
  transactions.
- Recommendation engines: Suggesting products or content based on
  user connections.
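The "friends of friends" case can be sketched as a two-hop traversal over an adjacency map. The sample graph and the helper names are made up; a graph database would express this as a traversal query in its own language rather than application code.

```python
# Undirected friendship graph as an adjacency map (sample data is made up).
friends = {
    "ada": {"grace", "alan"},
    "grace": {"ada", "linus"},
    "alan": {"ada", "linus", "edsger"},
    "linus": {"grace", "alan"},
    "edsger": {"alan"},
}


def friends_of_friends(graph, person):
    """People exactly two hops away who are not already direct friends."""
    direct = graph.get(person, set())
    two_hops = set()
    for friend in direct:
        two_hops |= graph.get(friend, set())
    return two_hops - direct - {person}


def recommend(graph, person):
    """Rank friend-of-friend candidates by number of mutual friends."""
    direct = graph.get(person, set())
    scores = {
        candidate: len(direct & graph.get(candidate, set()))
        for candidate in friends_of_friends(graph, person)
    }
    return sorted(scores, key=lambda name: (-scores[name], name))
```

For example, `recommend(friends, "ada")` ranks `linus` (two mutual friends) ahead of `edsger` (one mutual friend).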
19. How do NoSQL databases handle transactions?
Traditionally, NoSQL databases prioritized performance and scalability over
multi-document transactions, guaranteeing atomicity only for operations on a
single document or record. However, many modern NoSQL databases now support
ACID-compliant transactions across multiple documents; MongoDB, for example,
added them for replica sets in version 4.0 and for sharded clusters in version
4.2. This provides a balance between the flexibility of NoSQL and the strong
guarantees of SQL.
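What "all or nothing across documents" means can be sketched in memory. The example below models only atomicity (not isolation or durability) and is not any database's API; `transfer`, `InsufficientFunds`, and the account data are hypothetical.

```python
import copy


class InsufficientFunds(Exception):
    pass


accounts = {
    "alice": {"balance": 100},
    "bob": {"balance": 20},
}


def transfer(docs, source, target, amount):
    """Update two documents atomically: both changes apply, or neither does."""
    snapshot = copy.deepcopy(docs)  # state to restore on failure
    try:
        docs[target]["balance"] += amount
        docs[source]["balance"] -= amount
        if docs[source]["balance"] < 0:
            raise InsufficientFunds(source)
    except Exception:
        # Roll back every partial change before re-raising.
        docs.clear()
        docs.update(snapshot)
        raise


transfer(accounts, "alice", "bob", 30)  # commits: alice 70, bob 50

try:
    transfer(accounts, "bob", "alice", 500)  # fails and rolls back
except InsufficientFunds:
    pass
```

After the failed transfer, neither balance has changed, even though the credit to `alice` had already been applied before the error was detected.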
20. Explain the concept of "aggregate-oriented" databases.
NoSQL databases are often called "aggregate-oriented" because they store and
operate on aggregates (like documents or key-value pairs) rather than individual
data points. This simplifies data access by grouping related data together, which
is crucial for achieving high performance at scale.