School of Computer Science and Engineering
(SCOPE)
                 Fall Semester 2024-25
             COURSE CODE: CBS3007
COURSE TITLE: Design and Analysis of Algorithms
               Digital Assignment- 1
                                  Priyanshu Kumar-21BBS0076
                                       Devansh Saxena-21BBS0178
    Customer Relationship Management: Customer Behaviour
      Prediction and Trend Analysis Using Data Analytics
Problem Description:
Customer Relationship Management (CRM) aims to improve customer satisfaction
and retention by leveraging data analytics to understand customer behavior and
preferences. The challenge is to analyze large volumes of customer data to identify
trends, predict future behaviors, and create targeted marketing strategies.
Significance of the Problem:
Understanding customer behavior is crucial for businesses to tailor their products and
services, optimize marketing strategies, and enhance customer satisfaction. Failure to
adequately analyze customer data can lead to missed opportunities, decreased
customer loyalty, and reduced revenue.
Applications:
• Retail: Personalizing marketing campaigns based on customer purchase history.
• Banking: Predicting customer churn and identifying potential cross-selling
opportunities.
• E-commerce: Recommending products based on browsing and purchasing
behaviour.
Description of the Real World Scenario of the Project:
Consider an online retail company that collects data from customer interactions,
including purchases, product reviews, and browsing behaviour. The company aims to
use this data to analyse customer trends, predict future buying behaviour, and enhance
customer satisfaction through targeted marketing campaigns.
Expected Input and Output Pattern:
• Input:
      o Customer data including demographics, purchase history, browsing patterns,
      and feedback.
      o Time series data indicating customer interactions over time.
• Output:
      o Predictive models that forecast customer behaviour.
      o Trend analysis reports identifying significant patterns in customer
      preferences.
      Algorithm Using Various Strategies:
      a) Bruteforce:
             • Description: Examine every possible combination of customer data
      points to identify patterns and trends.
             • Pseudocode:
      function bruteforcePredictor(data):
       best_prediction = None
       best_score = -inf
       for every combination of customer data:
       score = evaluate_combination(combination)
       if score > best_score:
       best_score = score
       best_prediction = combination
       return best_prediction
Explanation:
        • This algorithm evaluates every possible combination of customer data
to find the best predictor of behavior.
       • Why Not Chosen: The bruteforce approach is computationally
infeasible for large datasets due to its exponential time complexity. As the
number of customers and features increases, the number of combinations grows
exponentially.
      • Feasibility: This approach is impractical due to the vast number of
combinations, leading to exponential complexity.
b. Backtracking:
      • Description: Build potential models by adding data points iteratively
and backtrack when the model fails to explain the data.
      • Pseudocode:
function backtrackPredictor(data, current_combination):
if is_solution(current_combination):
record_solution(current_combination)
for each option in available_options:
add option to current_combination
backtrackPredictor(data, current_combination)
remove option from current_combination
Explanation:
• This algorithm incrementally builds candidate solutions and abandons them if
they fail to meet the criteria.
• Why Not Chosen: Backtracking can be slow and may still require significant
time
for large datasets. It’s also complicated to implement for predictive modeling.
• Feasibility: Backtracking can be slow and may not scale well for large
datasets.
c. Branch and Bound:
• Description: Systematically explore branches of potential models while
pruning those that do not meet certain criteria (e.g., minimum accuracy).
• Pseudocode:
function branchAndBoundPredictor(data):
initialize priority queue
add initial state to queue
while queue is not empty:
current_state = remove state with highest priority
if is_solution(current_state):
record_solution(current_state)
for each neighbor of current_state:
if is_better_than_best(neighbor):
add neighbor to queue
Explanation:
• This algorithm systematically explores branches of possible solutions and
eliminates branches that cannot yield better results.
• Why Not Chosen: While more efficient than backtracking, it may still be
impractical for large datasets and can be complex to implement for the
prediction of customer behaviors.
• Feasibility: More efficient than backtracking but still not ideal for complex
datasets with many features.
d. Dynamic Programming (Chosen Strategy):
• Description: Use dynamic programming to break down the problem into
smaller, manageable subproblems, allowing for efficient analysis and
prediction of customer behavior. For instance, store intermediate results of
customer trend analysis to avoid recalculating them.
• Pseudocode:
function dynamicProgrammingPredictor(data):
initialize dp_table with size [num_customers][num_features]
for each customer in data:
for each feature in customer_features:
calculate_value(dp_table, customer, feature)
best_solution = find_best_solution(dp_table)
return best_solution
Explanation:
• This approach breaks the problem into smaller subproblems and stores the
results for efficient reuse.
• Why Chosen: Dynamic programming is efficient for large datasets, as it
reduces redundant calculations. It allows for effective analysis of customer
behavior patterns and trends.
• Feasibility: Dynamic programming is highly efficient for large datasets,
optimizing both space and time complexity.
Discuss the Algorithm of Chosen Strategy & Its Complexity:
Dynamic Programming Approach:
       • Steps:
              1. Data Preprocessing: Clean and format customer data,
              transforming it into a suitable structure for analysis.
              2. Feature Engineering: Create features that represent significant
              aspects of customer behavior, such as average purchase
              frequency, time between purchases, and customer lifetime value.
              3. DP Table Definition: Define a DP table
              dp[customers][features] to store intermediate results for customer
              behavior patterns.
              4. Trend Analysis: Populate the DP table by iterating through
              customer data and aggregating insights into trends.
              5. Model Training: Use the insights from the DP table to train
              predictive models using machine learning techniques (e.g.,
              regression, decision trees).
              6. Prediction and Reporting: Make predictions based on the
              trained models and generate reports outlining customer trends
              and behaviors.
Time Complexity:
• O(n * m), where n is the number of customers and m is the number of
features.
This is manageable compared to the bruteforce approach.
Space Complexity:
• O(n * m) for storing the DP table.
Example Code:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
data = {
'customer_id': [1, 2, 3, 4, 5, 6, 7],
'age': [25, 35, 45, 30, 40, 45, 55],
'purchase_frequency': [5, 15, 25, 10, 20, 35, 65],
'avg_spent': [100, 150, 200, 120, 180, 130, 250],
'churned': [0, 1, 0, 0, 1,0 ,1]
df = pd.DataFrame(data)
X = df[['age', 'purchase_frequency', 'avg_spent']]
y = df['churned']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy * 100:.2f}%")
new_customer = pd.DataFrame({'age': [28], 'purchase_frequency': [12],
'avg_spent':
[130]})
prediction = clf.predict(new_customer)
if prediction == 1:
print("The customer is likely to churn.")
else:
print("The customer is likely to be retained.")
Output:
How the Chosen Algorithm Works:
      1. Data Preparation: A sample dataset is created, consisting of customer
      IDs, ages, purchase frequencies, average spending, and churn status.
      This data is converted into a DataFrame for easier manipulation.
      2. Feature Selection: The relevant features (age, purchase_frequency,
      avg_spent) are selected for training the model.
      3. Data Splitting: The data is split into training and testing sets using an
      80/20 ratio.
      4. Model Training: A Random Forest Classifier is initialized and trained
      on the training data. This algorithm is robust for classification tasks and
      handles various data types effectively.
      5. Prediction: The model predicts churn for the test dataset.
      The accuracy of the model is evaluated using the accuracy score metric.
      6. New Customer Prediction: The model is used to predict whether a
      new customer (with given attributes) is likely to churn based on their
      data.
Example Execution:
1. Sample Input:
o New Customer: Age = 28, Purchase Frequency = 12, Average Spent = 130.
2. Runtime Output:
o The program outputs the model's accuracy (e.g., "Model Accuracy:
50.00%").
o It will also indicate if the new customer is likely to churn or be retained
(e.g., "The customer is likely to be retained.").
Observations:
• Dynamic programming provides an efficient method for analyzing customer
behavior by storing intermediate results.
• Predictive models derived from well-structured data can significantly improve
customer relationship strategies.
• By identifying trends and predicting behavior, businesses can tailor their
marketing efforts, leading to increased customer satisfaction and loyalty.
References:
• Kotler, Philip, and Keller, Kevin Lane. Marketing Management. Pearson
Education,
2016.
• F. Chen, et al. "Customer Relationship Management: A Data-Driven
Approach".
Journal of Marketing, 2017