Mongo dB
Course Basic Information:
Course Code: BDS456B
Semester: 4
Total Hours: 24
Credits: 1
Marks Distribution: CIE Marks (50) + SEE Marks (50) = Total Marks (100)
Course Objectives:
1. Understanding basic MongoDB functions, operators and operations
2. Learning about Indexing and Advanced Indexing
3. Applying aggregation and Map Reduction
4. Demonstrating text searching on collections
Experiments/Labs (10 in total):
1. WHERE clause operations and basic MongoDB commands (Insert, Query, Update, Delete)
2. Field selection and limiting results
3. Query selectors (comparison, logical, geospatial, bitwise)
4. Projection operators ($, $elematch, $slice)
5. Aggregation operations ($avg, $min, $max, $push, $addToSet)
6. Aggregation Pipeline operations ($match, $group, $sort, $project, $skip)
7. Queries for listings and reviews in e-commerce
8. Index creation and query optimization
9. Text search implementation
10. Text search using aggregation pipeline
Course Outcomes:
1. Using MongoDB commands and queries
2. Understanding aggregate pipelines for data extraction
3. Query optimization through indexes
4. Developing aggregate pipelines for text search
What is MongoDB?
MongoDB is a popular open-source NoSQL database that stores data in a flexible, JSON-like format
called BSON (Binary JSON). Unlike traditional relational databases, MongoDB is document-oriented,
meaning it stores data in collections of documents rather than tables with rows and columns.
Key Features
1. Document-Oriented Storage
o Stores data in flexible JSON-like documents
o Fields can vary between documents
o No need for predefined schema
2. High Performance
o Built-in support for horizontal scaling (sharding)
o Indexed support including text, geospatial, and compound indexes
o Fast in-place updates
3. High Availability
o Automatic failover
o Built-in replication
o Multi-datacenter deployment support
4. Horizontal Scalability
o Automatic sharding for distributed data
o Load balancing
o Automatic data distribution
Benefits
1. Flexible Schema
o Adapt to changing requirements without redesigning database
o Handle diverse data types
o Evolve applications faster
2. Scalability
o Scale horizontally across machines
o Handle increasing data volumes
o Maintain performance under load
3. Performance
o Fast read/write operations
o Efficient querying with indexes
o Good for real-time analytics
4. Developer Friendly
o Natural data representation
o Rich query language
o Strong driver ecosystem
Why MongoDB is Required Today
1. Modern Application Needs
o Handles unstructured and semi-structured data
o Perfect for agile development
o Supports rapid iterations
2. Big Data Management
o Efficiently handles large volumes of data
o Scales easily as data grows
o Good for real-time data processing
3. Cloud-Native Applications
o Built for distributed systems
o Excellent for microservices architecture
o Easy cloud deployment
4. Digital Transformation
o Supports modern digital experiences
o Handles diverse data types
o Enables faster innovation
5. Use Cases
o Content Management Systems
o Real-time Analytics
o IoT Applications
o Mobile Applications
o Customer Data Platforms
Now, would you like me to proceed with the detailed explanation of each lab experiment with
examples? I'll start with Lab 1 which covers basic MongoDB operations, and we can go through each
one systematically.
MongoDB Installation
1. For Windows:
# Download MongoDB Community Server from:
https://www.mongodb.com/try/download/community
# Install MongoDB Compass (GUI Tool) from:
https://www.mongodb.com/try/download/compass
2. For Ubuntu/Linux:
# Import MongoDB public GPG key
wget -qO - https://www.mongodb.org/static/pgp/server-6.0.asc | sudo apt-key add -
# Create list file for MongoDB
echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu focal/mongodb-org/6.0
multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-6.0.list
# Update package list
sudo apt-get update
# Install MongoDB
sudo apt-get install -y mongodb-org
# Start MongoDB
sudo systemctl start mongod
# Enable MongoDB on system startup
sudo systemctl enable mongod
3. For Mac:
# Using Homebrew
brew tap mongodb/brew
brew install mongodb-community
# Start MongoDB
brew services start mongodb-community
Verify Installation:
# Check MongoDB version
mongod --version
# Connect to MongoDB shell
mongosh
Now let's start with Lab 1:
Lab 1: Basic MongoDB Operations
Part A: WHERE Clause, AND, OR Operations
1. First, create a database and collection:
// Create/Switch to database
use training_db
// Create a collection and insert sample data
db.products.insertMany([
name: "Laptop",
price: 999,
category: "Electronics",
inStock: true
},
name: "Smartphone",
price: 699,
category: "Electronics",
inStock: false
},
name: "Desk Chair",
price: 199,
category: "Furniture",
inStock: true
])
2. WHERE Clause Examples:
// Simple where clause
db.products.find({ category: "Electronics" })
// AND operation
db.products.find({
category: "Electronics",
inStock: true
})
// OR operation
db.products.find({
$or: [
{ category: "Electronics" },
{ price: { $lt: 200 } }
})
Part B: Basic CRUD Operations
1. Insert Operations:
// Insert single document
db.products.insertOne({
name: "Monitor",
price: 299,
category: "Electronics",
inStock: true
})
// Insert multiple documents
db.products.insertMany([
name: "Keyboard",
price: 49,
category: "Electronics"
},
name: "Mouse",
price: 29,
category: "Electronics"
])
2. Query Operations:
// Find all documents
db.products.find()
// Find with condition
db.products.find({ price: { $gt: 100 } })
// Find one document
db.products.findOne({ name: "Laptop" })
3. Update Operations:
// Update single document
db.products.updateOne(
{ name: "Laptop" },
{ $set: { price: 899 } }
// Update multiple documents
db.products.updateMany(
{ category: "Electronics" },
{ $inc: { price: 10 } }
4. Delete Operations:
// Delete single document
db.products.deleteOne({ name: "Mouse" })
// Delete multiple documents
db.products.deleteMany({ category: "Furniture" })
5. Projection (Select specific fields):
// Include specific fields (1 = include)
db.products.find({}, { name: 1, price: 1 })
// Exclude specific fields (0 = exclude)
db.products.find({}, { inStock: 0 })
Let's dive into Lab 2, which focuses on field selection and limiting results.
Lab 2: Field Selection and Document Limiting
Part A: Selecting Specific Fields (Projection)
Let's first create a sample collection with more complex data:
// Create a sample users collection
db.users.insertMany([
name: "John Doe",
email: "john@example.com",
age: 28,
address: {
street: "123 Main St",
city: "New York",
country: "USA"
},
interests: ["coding", "reading", "travel"],
subscription: {
plan: "premium",
status: "active"
},
name: "Jane Smith",
email: "jane@example.com",
age: 32,
address: {
street: "456 Park Ave",
city: "Boston",
country: "USA"
},
interests: ["photography", "music"],
subscription: {
plan: "basic",
status: "active"
// Add more sample data...
])
Now let's look at different ways to select specific fields:
1. Include Specific Fields:
// Select only name and email
db.users.find({}, { name: 1, email: 1 })
// Select nested fields
db.users.find({}, {
name: 1,
'address.city': 1,
'subscription.plan': 1
})
2. Exclude Specific Fields:
// Exclude certain fields
db.users.find({}, {
subscription: 0,
address: 0
})
// Exclude specific nested fields
db.users.find({}, {
'address.street': 0,
'subscription.status': 0
})
3. Array Field Projections:
// Get first element of interests array
db.users.find({}, {
name: 1,
interests: { $slice: 1 }
})
// Get last two elements of interests array
db.users.find({}, {
name: 1,
interests: { $slice: -2 }
})
Part B: Limiting Results
1. Basic Limit Operation:
// Get first 5 documents
db.users.find().limit(5)
// Combine with projection
db.users.find({}, {
name: 1,
email: 1
}).limit(5)
2. Skip and Limit (Pagination):
// Skip first 2 documents and get next 3
db.users.find().skip(2).limit(3)
// With conditions and projection
db.users.find(
{ age: { $gt: 25 } },
{ name: 1, age: 1 }
).skip(2).limit(3)
3. Sorting with Limit:
// Get 5 youngest users
db.users.find()
.sort({ age: 1 })
.limit(5)
// Get 3 most recent premium users
db.users.find({ 'subscription.plan': 'premium' })
.sort({ _id: -1 })
.limit(3)
4. Practical Examples:
// Example 1: Get names of 5 users in Boston
db.users.find(
{ 'address.city': 'Boston' },
{ name: 1, _id: 0 }
).limit(5)
// Example 2: Get email and subscription plan of active users
db.users.find(
{ 'subscription.status': 'active' },
email: 1,
'subscription.plan': 1
).limit(10)
// Example 3: Get users with pagination and age filter
const pageSize = 5;
const pageNumber = 2;
db.users.find(
{ age: { $gt: 25 } },
{ name: 1, age: 1, email: 1 }
.skip((pageNumber - 1) * pageSize)
.limit(pageSize)
5. Combining Everything:
// Complex query example
db.users.find(
age: { $gte: 25, $lte: 35 },
'subscription.status': 'active'
},
name: 1,
email: 1,
'address.city': 1,
interests: { $slice: 2 }
.sort({ age: -1 })
.skip(5)
.limit(10)
Key Points to Remember:
1. You can't mix inclusion (1) and exclusion (0) in the same projection, except for _id
2. The _id field is always included unless explicitly excluded
3. Skip() should be used carefully with large datasets as it can be inefficient
4. Limit() is useful for pagination and controlling response size
Let's explore Lab 3, which focuses on different types of query selectors in MongoDB.
Lab 3: Query Selectors
Part A: Comparison and Logical Selectors
First, let's create a sample collection to work with:
// Create products collection with diverse data
db.products.insertMany([
name: "Laptop Pro",
price: 1299.99,
stock: 50,
rating: 4.5,
tags: ["electronics", "computers"],
specs: {
ram: 16,
storage: 512,
processor: "i7"
},
lastUpdated: new Date("2024-01-15")
},
name: "Gaming Mouse",
price: 79.99,
stock: 100,
rating: 4.8,
tags: ["electronics", "gaming"],
specs: {
dpi: 16000,
buttons: 8
},
lastUpdated: new Date("2024-02-01")
// ... more sample data
])
1. Comparison Selectors:
// $eq - Equal to
db.products.find({ price: { $eq: 1299.99 } })
// $ne - Not equal to
db.products.find({ price: { $ne: 79.99 } })
// $gt - Greater than
db.products.find({ rating: { $gt: 4.0 } })
// $gte - Greater than or equal to
db.products.find({ stock: { $gte: 50 } })
// $lt - Less than
db.products.find({ price: { $lt: 1000 } })
// $lte - Less than or equal to
db.products.find({ rating: { $lte: 4.5 } })
// $in - Match any value in array
db.products.find({ price: { $in: [79.99, 1299.99] } })
// $nin - Not match any value in array
db.products.find({ tags: { $nin: ["gaming"] } })
2. Logical Selectors:
// $and - All conditions must match
db.products.find({
$and: [
{ price: { $gt: 100 } },
{ rating: { $gte: 4.0 } }
})
// $or - Any condition can match
db.products.find({
$or: [
{ stock: { $lt: 20 } },
{ rating: { $gt: 4.5 } }
})
// $not - Negates the condition
db.products.find({
price: { $not: { $gt: 1000 } }
})
// $nor - None of the conditions should match
db.products.find({
$nor: [
{ stock: { $lt: 10 } },
{ rating: { $lt: 3.0 } }
})
// Complex combinations
db.products.find({
$and: [
$or: [
{ price: { $lt: 100 } },
{ rating: { $gt: 4.5 } }
},
{ stock: { $gt: 0 } }
})
Part B: Geospatial and Bitwise Selectors
1. Geospatial Selectors:
// First, create a collection with location data
db.locations.insertMany([
name: "Coffee Shop",
location: {
type: "Point",
coordinates: [-73.97, 40.77] // [longitude, latitude]
},
name: "Restaurant",
location: {
type: "Point",
coordinates: [-73.96, 40.78]
])
// Create a 2dsphere index
db.locations.createIndex({ location: "2dsphere" })
// $near - Find locations near a point
db.locations.find({
location: {
$near: {
$geometry: {
type: "Point",
coordinates: [-73.97, 40.77]
},
$maxDistance: 1000 // in meters
})
// $geoWithin - Find locations within a polygon
db.locations.find({
location: {
$geoWithin: {
$geometry: {
type: "Polygon",
coordinates: [[
[-74.0, 40.7],
[-73.9, 40.7],
[-73.9, 40.8],
[-74.0, 40.8],
[-74.0, 40.7]
]]
})
2. Bitwise Selectors:
// Create a collection with binary flags
db.flags.insertMany([
{ name: "Flag1", bits: 4 }, // 100 in binary
{ name: "Flag2", bits: 7 } // 111 in binary
])
// $bitsAllSet - All bits are 1
db.flags.find({
bits: { $bitsAllSet: 3 } // 011 in binary
})
// $bitsAnySet - Any bit is 1
db.flags.find({
bits: { $bitsAnySet: 2 } // 010 in binary
})
// $bitsAllClear - All bits are 0
db.flags.find({
bits: { $bitsAllClear: 8 } // 1000 in binary
})
// $bitsAnyClear - Any bit is 0
db.flags.find({
bits: { $bitsAnyClear: 2 } // 010 in binary
})
Practical Examples:
// Find products with specific price range and tags
db.products.find({
$and: [
{ price: { $gte: 100, $lte: 1500 } },
{ tags: { $in: ["electronics"] } },
{ stock: { $gt: 0 } }
})
// Find locations within 5km of a point and with specific attributes
db.locations.find({
$and: [
location: {
$near: {
$geometry: {
type: "Point",
coordinates: [-73.97, 40.77]
},
$maxDistance: 5000
},
{ rating: { $gte: 4.0 } }
})
Let's explore Lab 4, which focuses on Projection Operators in MongoDB. These operators help us
retrieve specific elements from arrays and embedded documents.
Lab 4: Projection Operators ($, $elemMatch, and $slice)
Let's start by creating a sample collection with complex data:
// Create a collection with nested arrays and documents
db.courses.insertMany([
courseName: "Web Development",
instructor: "John Doe",
students: [
name: "Alice",
grades: [85, 92, 88, 95],
projects: [
{ name: "Project 1", score: 90 },
{ name: "Project 2", score: 95 }
},
name: "Bob",
grades: [78, 88, 92, 85],
projects: [
{ name: "Project 1", score: 85 },
{ name: "Project 2", score: 88 }
],
modules: [
{ name: "HTML", duration: "2 weeks" },
{ name: "CSS", duration: "2 weeks" },
{ name: "JavaScript", duration: "4 weeks" }
},
// Add more courses...
])
1. The $ Operator
Used to project the first element that matches the query condition:
// Find the first student who scored 92 in grades
db.courses.find(
{ "students.grades": 92 },
courseName: 1,
"students.$": 1
// Find specific project details
db.courses.find(
{ "students.projects.name": "Project 1" },
courseName: 1,
"students.projects.$": 1
)
2. The $elemMatch Operator
Returns the first element that matches the specified condition:
// Find students with grades above 90
db.courses.find({}, {
courseName: 1,
students: {
$elemMatch: {
grades: { $gt: 90 }
})
// Find students with high project scores
db.courses.find({}, {
students: {
$elemMatch: {
"projects.score": { $gte: 95 }
})
// Complex $elemMatch example
db.courses.find({}, {
students: {
$elemMatch: {
grades: { $gt: 90 },
"projects.score": { $gte: 85 }
})
3. The $slice Operator
Controls the number of elements to return from an array:
// Get first two modules
db.courses.find({}, {
courseName: 1,
modules: { $slice: 2 }
})
// Get last two grades of each student
db.courses.find({}, {
courseName: 1,
"students.grades": { $slice: -2 }
})
// Skip and limit array elements
db.courses.find({}, {
courseName: 1,
modules: { $slice: [1, 2] } // Skip 1, take 2
})
Practical Examples:
1. Combined Projection Operators:
// Complex query with multiple projections
db.courses.find(
{ "students.grades": { $gte: 90 } },
courseName: 1,
instructor: 1,
students: {
$elemMatch: {
grades: { $gte: 90 },
"projects.score": { $gt: 85 }
}
},
modules: { $slice: 2 }
2. Real-world Scenario - Course Report:
// Generate a focused course report
db.courses.find({}, {
courseName: 1,
instructor: 1,
// Get only top performing students
students: {
$elemMatch: {
grades: { $gt: 90 }
},
// Get only recent modules
modules: { $slice: -2 }
})
3. Student Performance Analysis:
// Analyze specific student performances
db.courses.find(
{ "students.name": "Alice" },
courseName: 1,
"students.$": 1,
// Include only their most recent grades
"students.grades": { $slice: -2 },
// Include only their latest project
"students.projects": { $slice: -1 }
}
)
4. Advanced Usage with Conditions:
// Complex report generation
db.courses.find({
"students.projects.score": { $gte: 90 }
}, {
courseName: 1,
students: {
$elemMatch: {
"projects.score": { $gte: 90 },
grades: { $exists: true }
},
"students.projects": { $slice: 1 },
"students.grades": { $slice: -2 },
modules: { $slice: [0, 2] }
})
Important Points to Remember:
1. $ returns only the first matching element
2. $elemMatch is useful for filtering embedded documents
3. $slice can be used with positive, negative numbers, or [skip, limit]
4. These operators can be combined for complex projections
5. Projections can significantly reduce the amount of data transferred
Let's explore Lab 5, which focuses on Aggregation Operations in MongoDB. These operations allow
us to perform complex data analysis and transformations.
Lab 5: Aggregation Operations
First, let's create a sample dataset to work with:
// Create a sales collection with detailed data
db.sales.insertMany([
{
date: new Date("2024-01-15"),
product: "Laptop",
category: "Electronics",
price: 1200,
quantity: 3,
customer: {
name: "John Doe",
city: "New York"
},
payment: "Credit Card"
},
date: new Date("2024-01-15"),
product: "Mouse",
category: "Electronics",
price: 25,
quantity: 10,
customer: {
name: "Jane Smith",
city: "Boston"
},
payment: "Cash"
// ... more sample data
])
1. Basic Aggregation Operations
$avg (Average):
// Calculate average price
db.sales.aggregate([
{
$group: {
_id: null,
averagePrice: { $avg: "$price" }
])
// Average price by category
db.sales.aggregate([
$group: {
_id: "$category",
avgPrice: { $avg: "$price" },
avgQuantity: { $avg: "$quantity" }
])
$min and $max:
// Find price ranges by category
db.sales.aggregate([
$group: {
_id: "$category",
minPrice: { $min: "$price" },
maxPrice: { $max: "$price" },
priceRange: {
$subtract: [
{ $max: "$price" },
{ $min: "$price" }
}
}
])
$push:
// Collect all products by category
db.sales.aggregate([
$group: {
_id: "$category",
products: { $push: "$product" }
])
// Push complete product details
db.sales.aggregate([
$group: {
_id: "$category",
productDetails: {
$push: {
name: "$product",
price: "$price",
quantity: "$quantity"
])
$addToSet:
// Unique customers by city
db.sales.aggregate([
$group: {
_id: "$customer.city",
uniqueCustomers: { $addToSet: "$customer.name" }
])
// Unique payment methods by category
db.sales.aggregate([
$group: {
_id: "$category",
paymentMethods: { $addToSet: "$payment" }
])
2. Complex Aggregation Examples
Sales Analysis:
// Comprehensive sales analysis
db.sales.aggregate([
$group: {
_id: {
category: "$category",
city: "$customer.city"
},
totalSales: { $sum: { $multiply: ["$price", "$quantity"] } },
averageQuantity: { $avg: "$quantity" },
uniqueCustomers: { $addToSet: "$customer.name" },
transactions: { $sum: 1 }
},
$sort: { totalSales: -1 }
])
Product Performance:
// Product performance metrics
db.sales.aggregate([
$group: {
_id: "$product",
totalRevenue: { $sum: { $multiply: ["$price", "$quantity"] } },
totalQuantity: { $sum: "$quantity" },
averagePrice: { $avg: "$price" },
salesCount: { $sum: 1 },
cities: { $addToSet: "$customer.city" }
},
$project: {
_id: 1,
totalRevenue: 1,
totalQuantity: 1,
averagePrice: 1,
salesCount: 1,
cityCount: { $size: "$cities" }
])
Time-based Analysis:
// Daily sales summary
db.sales.aggregate([
$group: {
_id: {
date: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
},
dailyRevenue: { $sum: { $multiply: ["$price", "$quantity"] } },
distinctProducts: { $addToSet: "$product" },
transactionCount: { $sum: 1 },
averageTransactionValue: { $avg: { $multiply: ["$price", "$quantity"] } }
},
$sort: { "_id.date": 1 }
])
Customer Analysis:
// Customer purchasing patterns
db.sales.aggregate([
$group: {
_id: "$customer.name",
totalSpent: { $sum: { $multiply: ["$price", "$quantity"] } },
purchasedCategories: { $addToSet: "$category" },
purchasedProducts: { $push: {
product: "$product",
date: "$date",
amount: { $multiply: ["$price", "$quantity"] }
}},
averageTransactionValue: { $avg: { $multiply: ["$price", "$quantity"] } }
},
$sort: { totalSpent: -1 }
])
Important Points to Remember:
1. Aggregation operations are performed in stages
2. $group is often used with other aggregation operators
3. $push includes duplicates while $addToSet removes duplicates
4. Complex calculations can be done using multiple operators
5. Results can be further processed using additional pipeline stages
Let's dive into Lab 6, which focuses on Aggregation Pipeline operations in MongoDB. The pipeline
allows us to perform complex transformations and analysis on our data in stages.
Lab 6: Aggregation Pipeline Operations
First, let's create a sample dataset to work with:
// Create an orders collection with rich data
db.orders.insertMany([
orderId: "ORD001",
customerName: "John Smith",
orderDate: new Date("2024-01-15"),
items: [
{ product: "Laptop", price: 1200, quantity: 1 },
{ product: "Mouse", price: 25, quantity: 2 }
],
shippingAddress: {
city: "New York",
country: "USA"
},
status: "delivered",
paymentMethod: "credit_card"
},
orderId: "ORD002",
customerName: "Jane Doe",
orderDate: new Date("2024-01-16"),
items: [
{ product: "Monitor", price: 300, quantity: 2 },
{ product: "Keyboard", price: 50, quantity: 1 }
],
shippingAddress: {
city: "Boston",
country: "USA"
},
status: "processing",
paymentMethod: "paypal"
// ... more sample data
])
1. Basic Pipeline Operations
$match Stage:
// Filter orders by status
db.orders.aggregate([
$match: {
status: "delivered",
"shippingAddress.country": "USA"
}
])
// Match with date range
db.orders.aggregate([
$match: {
orderDate: {
$gte: new Date("2024-01-01"),
$lt: new Date("2024-02-01")
])
$group Stage:
// Group by city with counts
db.orders.aggregate([
$group: {
_id: "$shippingAddress.city",
totalOrders: { $sum: 1 },
uniqueCustomers: { $addToSet: "$customerName" }
])
// Group by payment method with revenue
db.orders.aggregate([
$unwind: "$items"
},
$group: {
_id: "$paymentMethod",
totalRevenue: {
$sum: { $multiply: ["$items.price", "$items.quantity"] }
])
$sort Stage:
// Sort orders by date
db.orders.aggregate([
$sort: {
orderDate: -1 // Descending order
},
$limit: 5
])
$project Stage:
// Reshape the output
db.orders.aggregate([
$project: {
_id: 0,
orderNumber: "$orderId",
customer: "$customerName",
orderValue: {
$sum: {
$map: {
input: "$items",
as: "item",
in: { $multiply: ["$$item.price", "$$item.quantity"] }
])
2. Complex Pipeline Examples
Order Analysis Pipeline:
// Comprehensive order analysis
db.orders.aggregate([
// Match orders from specific period
$match: {
orderDate: {
$gte: new Date("2024-01-01"),
$lt: new Date("2024-02-01")
},
// Unwind items array
$unwind: "$items"
},
// Group by product
$group: {
_id: "$items.product",
totalQuantity: { $sum: "$items.quantity" },
totalRevenue: {
$sum: { $multiply: ["$items.price", "$items.quantity"] }
},
averageOrderValue: {
$avg: { $multiply: ["$items.price", "$items.quantity"] }
},
orders: { $addToSet: "$orderId" }
},
// Sort by revenue
$sort: { totalRevenue: -1 }
},
// Project final format
$project: {
product: "$_id",
totalQuantity: 1,
totalRevenue: 1,
averageOrderValue: { $round: ["$averageOrderValue", 2] },
numberOfOrders: { $size: "$orders" }
])
Customer Insights Pipeline:
// Customer purchasing analysis
db.orders.aggregate([
// Unwind items
{
$unwind: "$items"
},
// Group by customer
$group: {
_id: "$customerName",
totalSpent: {
$sum: { $multiply: ["$items.price", "$items.quantity"] }
},
orderCount: { $addToSet: "$orderId" },
productsBought: { $addToSet: "$items.product" },
cities: { $addToSet: "$shippingAddress.city" }
},
// Add computed fields
$project: {
customer: "$_id",
totalSpent: 1,
orderCount: { $size: "$orderCount" },
productCount: { $size: "$productsBought" },
shippingCities: "$cities",
averageOrderValue: {
$round: [{ $divide: ["$totalSpent", { $size: "$orderCount" }] }, 2]
},
// Sort by total spent
$sort: { totalSpent: -1 }
}
])
Geographic Analysis:
// Analysis by location
db.orders.aggregate([
$group: {
_id: {
city: "$shippingAddress.city",
country: "$shippingAddress.country"
},
orderCount: { $sum: 1 },
customers: { $addToSet: "$customerName" },
totalRevenue: {
$sum: {
$reduce: {
input: "$items",
initialValue: 0,
in: {
$add: [
"$$value",
{ $multiply: ["$$this.price", "$$this.quantity"] }
},
$project: {
location: "$_id",
orderCount: 1,
customerCount: { $size: "$customers" },
totalRevenue: 1,
averageOrderValue: {
$round: [{ $divide: ["$totalRevenue", "$orderCount"] }, 2]
},
$sort: { totalRevenue: -1 }
])
Let's explore Lab 7, which focuses on handling listings and reviews in an e-commerce context using
MongoDB. This lab demonstrates how to work with related data and complex queries.
Lab 7: E-commerce Listings and Reviews
First, let's create our sample collections:
javascript
Copy
// Create listings collection
db.listings.insertMany([
listing_url: "product/laptop-pro",
name: "Laptop Pro 2024",
price: 1299.99,
description: "High-performance laptop",
category: "Electronics",
host: {
id: "H001",
name: "TechStore",
picture_url: "https://example.com/host/techstore.jpg",
rating: 4.8
},
address: {
street: "123 Tech Avenue",
city: "San Francisco",
state: "CA",
country: "USA"
},
images: [
"https://example.com/laptop1.jpg",
"https://example.com/laptop2.jpg"
],
available: true
// ... more listings
])
// Create reviews collection
db.reviews.insertMany([
listing_id: "LP001",
reviewer: {
name: "John Smith",
id: "R001",
picture_url: "https://example.com/reviewers/john.jpg"
},
rating: 5,
comment: "Excellent product, fast delivery!",
date: new Date("2024-01-15"),
helpful_votes: 12,
images: ["https://example.com/review1.jpg"]
// ... more reviews
])
Part A: Listing Queries
1. Find Listings with Host Pictures:
// Basic query
db.listings.find(
{ "host.picture_url": { $exists: true } },
listing_url: 1,
name: 1,
"host.name": 1,
"host.picture_url": 1,
address: 1
// Advanced query with aggregation
db.listings.aggregate([
$match: {
"host.picture_url": { $exists: true, $ne: null },
available: true
},
$project: {
_id: 0,
listing_url: 1,
name: 1,
host_info: {
name: "$host.name",
picture: "$host.picture_url",
rating: "$host.rating"
},
location: {
city: "$address.city",
state: "$address.state"
},
$sort: { "host_info.rating": -1 }
])
2. Complex Listing Queries:
// Find listings with specific criteria
db.listings.aggregate([
$match: {
price: { $lt: 2000 },
"host.rating": { $gte: 4.5 },
available: true
},
$lookup: {
from: "reviews",
localField: "_id",
foreignField: "listing_id",
as: "reviews"
}
},
$addFields: {
average_rating: { $avg: "$reviews.rating" },
review_count: { $size: "$reviews" }
},
$project: {
name: 1,
price: 1,
host_info: "$host",
rating_summary: {
average: "$average_rating",
total_reviews: "$review_count"
},
images: { $slice: ["$images", 2] }
])
Part B: Review Analysis
1. Basic Review Summary:
// Get review summary for listings
db.reviews.aggregate([
$group: {
_id: "$listing_id",
average_rating: { $avg: "$rating" },
total_reviews: { $sum: 1 },
helpful_votes: { $sum: "$helpful_votes" },
reviews: {
$push: {
rating: "$rating",
comment: "$comment",
date: "$date"
},
$sort: { average_rating: -1 }
])
2. Detailed Review Analysis:
// Comprehensive review analysis
db.reviews.aggregate([
$match: {
date: {
$gte: new Date("2024-01-01")
},
$group: {
_id: "$listing_id",
review_metrics: {
average_rating: { $avg: "$rating" },
rating_distribution: {
$push: "$rating"
},
total_helpful_votes: { $sum: "$helpful_votes" },
review_count: { $sum: 1 }
},
recent_reviews: {
$push: {
reviewer: "$reviewer.name",
rating: "$rating",
comment: "$comment",
date: "$date",
helpful_votes: "$helpful_votes"
},
$lookup: {
from: "listings",
localField: "_id",
foreignField: "_id",
as: "listing_details"
},
$project: {
listing_name: { $arrayElemAt: ["$listing_details.name", 0] },
metrics: "$review_metrics",
recent_reviews: { $slice: ["$recent_reviews", 5] }
])
3. Advanced Analytics:
// Review trends and insights
db.reviews.aggregate([
$facet: {
"rating_summary": [
$group: {
_id: "$rating",
count: { $sum: 1 }
},
{ $sort: { _id: -1 } }
],
"monthly_trends": [
$group: {
_id: {
month: { $month: "$date" },
year: { $year: "$date" }
},
average_rating: { $avg: "$rating" },
review_count: { $sum: 1 }
},
{ $sort: { "_id.year": 1, "_id.month": 1 } }
],
"top_reviewers": [
$group: {
_id: "$reviewer.id",
reviewer_name: { $first: "$reviewer.name" },
review_count: { $sum: 1 },
avg_rating_given: { $avg: "$rating" }
},
{ $sort: { review_count: -1 } },
{ $limit: 5 }
])
Let's explore Lab 8, which focuses on Index Creation and Query Optimization in MongoDB. This is
crucial for improving database performance.
Lab 8: Index Creation and Query Optimization
First, let's create a sample collection with substantial data to demonstrate indexing effects:
// Create a products collection with sample data
db.products.insertMany([
sku: "LP1001",
name: "Gaming Laptop",
category: "Electronics",
price: 1299.99,
tags: ["gaming", "laptop", "electronics"],
specs: {
ram: "16GB",
processor: "i7",
storage: "1TB"
},
stock: 50,
supplier: {
name: "TechCorp",
rating: 4.8
},
lastUpdated: new Date()
// ... more products
])
Part A: Creating Different Types of Indexes
1. Single Field Index:
// Create basic index
db.products.createIndex({ name: 1 })
// Create index with options
db.products.createIndex(
{ sku: 1 },
{ unique: true, background: true }
2. Compound Index:
javascript
Copy
// Create compound index
db.products.createIndex(
{ category: 1, price: -1 }
// Compound index with options
db.products.createIndex(
{ supplier: 1, stock: -1 },
{ sparse: true }
)
3. Multikey Index (for arrays):
// Create index on array field
db.products.createIndex({ tags: 1 })
// Create index on nested array fields
db.products.createIndex({ "specs.features": 1 })
4. Text Index:
// Create text index
db.products.createIndex(
{ name: "text", description: "text" },
weights: {
name: 10,
description: 5
},
default_language: "english"
Part B: Query Optimization
1. Using explain() for Query Analysis:
// Analyze query execution
db.products.find({ category: "Electronics" })
.explain("executionStats")
// Analyze sort operation
db.products.find()
.sort({ price: -1 })
.explain("allPlansExecution")
2. Index Optimization Examples:
// Before index
db.products.find({
category: "Electronics",
price: { $gt: 1000 }
}).explain("executionStats")
// Create optimal index
db.products.createIndex(
{ category: 1, price: 1 }
// After index - check improvement
db.products.find({
category: "Electronics",
price: { $gt: 1000 }
}).explain("executionStats")
3. Covered Queries:
// Create index for covered query
db.products.createIndex(
{ sku: 1, name: 1, price: 1 }
// Covered query example
db.products.find(
{ sku: "LP1001" },
{ name: 1, price: 1, _id: 0 }
).explain("executionStats")
Practical Examples:
1. E-commerce Search Optimization:
// Create indexes for common queries
db.products.createIndex({ category: 1, price: 1, stock: 1 })
db.products.createIndex({ "supplier.name": 1 })
db.products.createIndex({
name: "text",
description: "text",
tags: "text"
})
// Optimized search query
db.products.find({
$text: { $search: "gaming laptop" },
category: "Electronics",
price: { $lt: 2000 },
stock: { $gt: 0 }
}).sort({ price: 1 })
2. Performance Testing:
// Test different query patterns
const testQueries = async () => {
// Test 1: Basic find
db.products.find({ category: "Electronics" })
.explain("executionStats")
// Test 2: Range query
db.products.find({
price: { $gte: 500, $lte: 1500 }
}).explain("executionStats")
// Test 3: Compound conditions
db.products.find({
category: "Electronics",
"supplier.rating": { $gte: 4.5 },
stock: { $gt: 0 }
}).explain("executionStats")
}
3. Index Management:
// List all indexes
db.products.getIndexes()
// Remove unused indexes
db.products.dropIndex("index_name")
// Monitor index usage
db.products.aggregate([
{ $indexStats: {} }
])
4. Best Practices Implementation:
// Create selective compound index
db.products.createIndex(
category: 1,
price: 1,
"supplier.rating": 1
},
partialFilterExpression: {
stock: { $gt: 0 }
// Create indexes for sorting
db.products.createIndex(
{ lastUpdated: -1, name: 1 },
{ background: true }
)
Key Points to Remember:
1. Always analyze queries with explain()
2. Create indexes based on query patterns
3. Use compound indexes for multiple field queries
4. Consider partial indexes for better efficiency
5. Monitor and maintain indexes regularly
Let's explore Lab 9, which focuses on Text Search functionality in MongoDB using a catalog data
collection. Text search is powerful for implementing search features in applications.
Lab 9: Text Search with Catalog Data
First, let's create a sample catalog collection:
// Create catalog collection with sample data
db.catalog.insertMany([
productId: "PRD001",
name: "Professional Digital Camera",
description: "High-end DSLR camera with 24MP sensor and 4K video recording",
category: "Electronics",
features: [
"24MP Full-frame sensor",
"4K Video recording",
"Weather-sealed body"
],
specifications: {
brand: "CameraPro",
model: "X2000",
type: "DSLR"
},
tags: ["camera", "professional", "photography", "4K"],
price: 1499.99
// ... more products
])
Part A: Basic Text Search
1. Create Text Index:
// Create text index on multiple fields
db.catalog.createIndex({
name: "text",
description: "text",
features: "text",
tags: "text"
}, {
weights: {
name: 10,
description: 5,
features: 3,
tags: 2
},
default_language: "english"
})
2. Simple Text Search:
javascript
Copy
// Basic text search
db.catalog.find({
$text: {
$search: "digital camera"
})
// With score and sorting
db.catalog.find(
{ $text: { $search: "professional camera" } },
{ score: { $meta: "textScore" } }
).sort({ score: { $meta: "textScore" } })
Part B: Advanced Text Search
1. Phrase Search:
// Search for exact phrase
db.catalog.find({
$text: {
$search: "\"4K video recording\""
})
2. Excluding Words:
// Search with exclusion
db.catalog.find({
$text: {
$search: "camera -compact" // Exclude 'compact'
})
3. Complex Search Queries:
// Combine text search with other criteria
db.catalog.find({
$text: { $search: "professional camera" },
price: { $lt: 2000 },
category: "Electronics"
}, {
score: { $meta: "textScore" },
name: 1,
price: 1,
description: 1
}).sort({ score: { $meta: "textScore" } })
Practical Examples:
1. Product Search with Aggregation:
// Advanced search with aggregation
db.catalog.aggregate([
$match: {
$text: { $search: "professional camera" }
},
$addFields: {
searchScore: { $meta: "textScore" }
},
$match: {
searchScore: { $gt: 1.0 } // Minimum relevance threshold
},
$project: {
name: 1,
description: 1,
price: 1,
relevance: "$searchScore",
matchedTags: {
$filter: {
input: "$tags",
as: "tag",
cond: {
$regexMatch: {
input: "$$tag",
regex: "professional|camera",
options: "i"
},
$sort: { relevance: -1 }
])
2. Faceted Search:
// Search with facets
db.catalog.aggregate([
$match: {
$text: { $search: "camera" }
},
$facet: {
"categories": [
$group: {
_id: "$category",
count: { $sum: 1 }
}
],
"priceRanges": [
$bucket: {
groupBy: "$price",
boundaries: [0, 500, 1000, 2000, 5000],
default: "5000+",
output: {
count: { $sum: 1 },
items: { $push: "$name" }
],
"tags": [
{ $unwind: "$tags" },
$group: {
_id: "$tags",
count: { $sum: 1 }
},
{ $sort: { count: -1 } },
{ $limit: 10 }
])
3. Search Results Enhancement:
// Enhanced search results
db.catalog.aggregate([
{
$match: {
$text: { $search: "professional camera" }
},
$addFields: {
searchScore: { $meta: "textScore" },
highlights: {
$function: {
body: function(description, searchTerm) {
const regex = new RegExp(`(${searchTerm})`, 'gi');
return description.replace(regex, '**$1**');
},
args: ["$description", "professional camera"],
lang: "js"
},
$project: {
name: 1,
description: "$highlights",
price: 1,
relevance: "$searchScore",
features: 1,
specs: "$specifications"
},
{
$sort: { relevance: -1 }
])
Let's explore Lab 10, which focuses on creating advanced aggregation pipelines for text search in
catalog data. This lab combines text search capabilities with powerful aggregation features.
Lab 10: Aggregation Pipeline for Text Search
First, let's ensure we have a comprehensive catalog collection:
// Create catalog collection with detailed data
db.catalog.insertMany([
productId: "PRD001",
name: "Professional DSLR Camera",
description: "Advanced digital camera with full-frame sensor",
category: "Cameras",
price: 1299.99,
brand: "PhotoPro",
specifications: {
sensor: "Full-frame",
resolution: "24MP",
features: ["4K Video", "WiFi", "Bluetooth"]
},
reviews: [
{ rating: 5, text: "Excellent camera for professionals" },
{ rating: 4, text: "Great features but expensive" }
],
tags: ["photography", "professional", "digital"],
lastUpdated: new Date()
// ... more products
])
Complex Aggregation Pipelines for Text Search
1. Basic Text Search Pipeline:
db.catalog.aggregate([
// Match text search criteria
$match: {
$text: { $search: "professional camera" }
},
// Add text score
$addFields: {
score: { $meta: "textScore" }
},
// Filter by minimum score
$match: {
score: { $gt: 1.0 }
},
// Sort by relevance
$sort: {
score: -1,
price: 1
])
2. Advanced Search with Categories and Price Ranges:
db.catalog.aggregate([
// Initial text search
$match: {
$text: { $search: "camera professional" }
},
// Add search metadata
$addFields: {
relevanceScore: { $meta: "textScore" },
searchRelevance: {
$divide: [
{ $meta: "textScore" },
{ $add: [{ $strLenCP: "$description" }, 1] }
},
// Faceted results
$facet: {
"searchResults": [
{ $sort: { relevanceScore: -1 } },
{ $limit: 10 },
$project: {
name: 1,
description: 1,
price: 1,
relevanceScore: 1,
category: 1
],
"categoryBreakdown": [
$group: {
_id: "$category",
count: { $sum: 1 },
avgPrice: { $avg: "$price" }
],
"priceRanges": [
$bucket: {
groupBy: "$price",
boundaries: [0, 500, 1000, 2000, 5000],
default: "5000+",
output: {
count: { $sum: 1 },
products: { $push: "$name" }
])
3. Enhanced Search with Review Analysis:
db.catalog.aggregate([
// Text search
$match: {
$text: { $search: "professional camera" }
},
// Unwind reviews for analysis
$unwind: "$reviews"
},
// Group and analyze
$group: {
_id: "$productId",
name: { $first: "$name" },
description: { $first: "$description" },
price: { $first: "$price" },
avgRating: { $avg: "$reviews.rating" },
reviewCount: { $sum: 1 },
reviewHighlights: { $push: "$reviews.text" }
},
// Add relevance score
$addFields: {
searchScore: { $meta: "textScore" },
reviewSummary: {
$reduce: {
input: "$reviewHighlights",
initialValue: "",
in: { $concat: ["$$value", " | ", "$$this"] }
}
},
// Final projection
$project: {
name: 1,
price: 1,
relevance: "$searchScore",
ratings: {
average: "$avgRating",
count: "$reviewCount"
},
highlights: {
description: "$description",
reviews: "$reviewSummary"
},
// Sort by relevance and rating
$sort: {
relevance: -1,
"ratings.average": -1
])
4. Comprehensive Search Pipeline:
db.catalog.aggregate([
// Stage 1: Initial text search
{
$match: {
$text: { $search: "professional camera" }
},
// Stage 2: Add computed fields
$addFields: {
textScore: { $meta: "textScore" },
matchedTags: {
$filter: {
input: "$tags",
as: "tag",
cond: {
$regexMatch: {
input: "$$tag",
regex: "professional|camera",
options: "i"
},
// Stage 3: Faceted search results
$facet: {
"mainResults": [
{ $match: { textScore: { $gt: 1.0 } } },
{ $sort: { textScore: -1 } },
{ $limit: 10 },
{
$project: {
_id: 0,
productId: 1,
name: 1,
price: 1,
description: 1,
matchedTags: 1,
relevance: "$textScore"
],
"relatedProducts": [
{ $unwind: "$tags" },
{ $group: {
_id: "$tags",
products: { $push: {
id: "$productId",
name: "$name"
}},
count: { $sum: 1 }
}},
{ $sort: { count: -1 } },
{ $limit: 5 }
],
"priceAnalysis": [
$group: {
_id: null,
avgPrice: { $avg: "$price" },
minPrice: { $min: "$price" },
maxPrice: { $max: "$price" }
])