Monday, 9 February 2026

Minor Project Ideas for B.Tech Computer Science

B.Tech Minor Project: Data Science using Regression

This project uses Linear Regression for continuous prediction and Logistic Regression for binary classification problems such as pass/fail and disease prediction.

   MINOR PROJECT IDEAS 


1. Heart Disease Prediction

Problem:
Classify if a person has heart disease.

Inputs:

  • Age

  • BP

  • Cholesterol

  • Heart rate

Output:

  • Disease / No disease
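
A minimal sketch of how this classifier could be set up with scikit-learn. The data is synthetic and generated only so the example runs; the column names, value ranges, the labelling rule, and the sample patient are all assumptions, not clinical values.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 400

# Synthetic inputs (assumed ranges, for illustration only)
df = pd.DataFrame({
    "age": rng.integers(30, 80, n),
    "bp": rng.normal(130, 15, n),
    "cholesterol": rng.normal(220, 35, n),
    "heart_rate": rng.normal(75, 10, n),
})

# Hypothetical labelling rule: risk rises with age, BP and cholesterol.
# heart_rate is deliberately left out, so its coefficient should stay small.
score = (0.06 * (df["age"] - 55) + 0.04 * (df["bp"] - 130)
         + 0.02 * (df["cholesterol"] - 220) + rng.normal(0, 1, n))
df["disease"] = (score > 0).astype(int)

X = df[["age", "bp", "cholesterol", "heart_rate"]]
y = df["disease"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Scaling puts features with different units on a comparable footing
model = make_pipeline(StandardScaler(), LogisticRegression())
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)
print("Test accuracy:", test_accuracy)

# Probability of disease for one made-up patient
patient = pd.DataFrame(
    [{"age": 62, "bp": 150, "cholesterol": 260, "heart_rate": 80}]
)
p_disease = model.predict_proba(patient)[0, 1]
print("P(disease):", round(p_disease, 3))
```

For a real submission the synthetic frame would be replaced by an actual dataset; the rest of the flow stays the same.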

2. College Placement Probability Prediction System

(Highly impressive for viva)

Problem

Predict whether a student will get placed + expected salary range.

Models

  • Logistic Regression → placed / not placed

  • Linear Regression → expected package

Features

  • CGPA trend

  • Coding test scores

  • Internship experience

  • Soft-skill ratings

Extra Credit

  • Feature importance analysis

  • Probability confidence score
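
One possible way to combine the two models is sketched below. The dataset is synthetic, and the feature names, the "signal" used to label rows, and the salary scale (LPA) are assumptions made for illustration. Coefficients on standardized features are used here as a simple stand-in for feature importance.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n = 300
features = ["cgpa", "coding_score", "internships", "soft_skills"]

X = pd.DataFrame({
    "cgpa": rng.uniform(5, 10, n),
    "coding_score": rng.uniform(0, 100, n),
    "internships": rng.integers(0, 4, n),
    "soft_skills": rng.uniform(1, 5, n),
})

# Hypothetical ground truth, used only to label the synthetic rows
signal = (0.8 * (X["cgpa"] - 7.5) + 0.03 * (X["coding_score"] - 50)
          + 0.5 * (X["internships"] - 1.5) + 0.3 * (X["soft_skills"] - 3))
placed = (signal + rng.normal(0, 1, n) > 0).astype(int)
package_lpa = 4 + 1.5 * signal + rng.normal(0, 0.5, n)

scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)

# Logistic Regression: placed / not placed
clf = LogisticRegression().fit(X_scaled, placed)

# Linear Regression: expected package, fitted on placed students only
mask = (placed == 1).to_numpy()
reg = LinearRegression().fit(X_scaled[mask], package_lpa[mask])

# Feature importance: larger |coefficient| means stronger influence
importance = pd.Series(clf.coef_[0], index=features)
print(importance.abs().sort_values(ascending=False))

# Confidence score and expected package for one made-up student
student = pd.DataFrame(
    [{"cgpa": 8.4, "coding_score": 72, "internships": 1, "soft_skills": 4.0}]
)
student_scaled = scaler.transform(student)
p_placed = clf.predict_proba(student_scaled)[0, 1]
expected_package = reg.predict(student_scaled)[0]
print("P(placed):", round(p_placed, 3))
print("Expected package (LPA):", round(expected_package, 2))
```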


3. Financial Credit Risk & EMI Recommendation System

(Industry-oriented)

Problem

Predict:

  • Loan approval probability

  • Safe EMI amount

Models

  • Logistic Regression → loan approval

  • Linear Regression → EMI amount

Features

  • Income trend

  • Expenses

  • Credit behavior

  • Loan tenure

BIG Add-On

  • Risk tiers (Low / Medium / High)
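
The risk-tier add-on can be as simple as bucketing the approval probability from the logistic model. The function name and the two cutoffs below are assumptions; in a real project they would be tuned on validation data.

```python
def risk_tier(approval_prob, medium_cutoff=0.4, low_cutoff=0.7):
    """Map a loan-approval probability to a risk tier (cutoffs are assumed)."""
    if not 0.0 <= approval_prob <= 1.0:
        raise ValueError("probability must be in [0, 1]")
    if approval_prob >= low_cutoff:
        return "Low"      # very likely to be approved, so low risk
    if approval_prob >= medium_cutoff:
        return "Medium"
    return "High"


# Probabilities as they might come from model.predict_proba(...)[:, 1]
probs = [0.92, 0.55, 0.18]
tiers = [risk_tier(p) for p in probs]
print(tiers)  # ['Low', 'Medium', 'High']
```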


4. Smart Energy Consumption Forecasting System

(Engineering + Sustainability)

Problem

Forecast electricity consumption & detect over-usage risk.

Models

  • Linear Regression → energy units

  • Logistic Regression → overload risk

Features

  • Appliance usage

  • Seasonal effects

  • Household size

Outputs

  • Monthly forecast

  • Warning alerts
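
A rough sketch of the forecast-plus-alert idea, with the season handled through one-hot encoding. All data is synthetic; the column names, the unit formula, and the 350-unit over-usage limit are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(1)
n = 240
df = pd.DataFrame({
    "appliance_hours": rng.uniform(2, 14, n),   # daily appliance usage
    "household_size": rng.integers(1, 7, n),
    "season": rng.choice(["summer", "monsoon", "winter"], n),
})

# Hypothetical consumption formula used to generate monthly units
season_load = df["season"].map({"summer": 60, "monsoon": 20, "winter": 35})
df["units"] = (25 * df["appliance_hours"] + 18 * df["household_size"]
               + season_load + rng.normal(0, 15, n))

LIMIT = 350  # assumed over-usage threshold in units per month
df["overload"] = (df["units"] > LIMIT).astype(int)

# One-hot encode the categorical season column
X = pd.get_dummies(
    df[["appliance_hours", "household_size", "season"]],
    columns=["season"], dtype=float,
)

forecaster = LinearRegression().fit(X, df["units"])                  # energy units
alerter = LogisticRegression(max_iter=1000).fit(X, df["overload"])   # overload risk

# Forecast for one made-up household
new_home = pd.DataFrame(
    [{"appliance_hours": 12.0, "household_size": 5, "season": "summer"}]
)
new_X = pd.get_dummies(new_home, columns=["season"], dtype=float)
new_X = new_X.reindex(columns=X.columns, fill_value=0.0)  # align dummy columns

forecast_units = forecaster.predict(new_X)[0]
p_overload = alerter.predict_proba(new_X)[0, 1]
print("Monthly forecast:", round(forecast_units, 1), "units")
if p_overload > 0.5:
    print("Warning: over-usage risk", round(p_overload, 2))
```

The `reindex` call matters: a single new row only produces the dummy column for its own season, so the missing season columns have to be filled with zeros before prediction.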


5. Smart Traffic Congestion & Accident Risk System

(Engineering + AI)

Problem

Predict:

  • Traffic congestion level

  • Accident probability

Models

  • Linear Regression → congestion index

  • Logistic Regression → accident risk

Features

  • Vehicle count

  • Time of day

  • Weather


6. Social Media Misinformation Risk Analyzer

(Trending & Research-oriented)

Problem

Predict whether content is misleading.

Models

  • Logistic Regression → fake / real

  • Linear Regression → virality score

Features

  • Engagement metrics

  • Posting time

  • Account credibility

BIG VALUE

  • Explainable coefficients

  • Ethical AI discussion
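
To make the "explainable coefficients" point concrete, the sketch below converts logistic regression coefficients into odds ratios. The three feature names and the data-generating rule are assumptions; the data is synthetic.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
n = 500
names = ["shares_per_hour", "night_post", "account_credibility"]

X = np.column_stack([
    rng.exponential(20, n),   # engagement metric
    rng.integers(0, 2, n),    # 1 if posted at night
    rng.uniform(0, 1, n),     # credibility score in [0, 1]
])

# Hypothetical rule: viral posts from low-credibility accounts are riskier
logit = 0.03 * (X[:, 0] - 20) + 0.4 * X[:, 1] - 3.0 * (X[:, 2] - 0.5)
y = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logit))).astype(int)

X_std = StandardScaler().fit_transform(X)
clf = LogisticRegression().fit(X_std, y)

# exp(coefficient) = multiplicative change in the odds of "misleading"
# for a one-standard-deviation increase in that feature
odds_ratios = dict(zip(names, np.exp(clf.coef_[0])))
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio {ratio:.2f}")
```

An odds ratio above 1 pushes the prediction towards "misleading", and a value below 1 pushes it towards "real", which is the kind of statement that can be defended in a viva.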


7. CROP DISEASE PREDICTION SYSTEM

(Data Science Project using Logistic Regression)

🔹 Problem Statement

Early detection of crop diseases is critical to reduce yield loss and improve agricultural productivity.
This project aims to predict whether a crop is diseased or healthy based on environmental and crop-related parameters using Logistic Regression.


🔹 Why this project is “BIG & GOOD”

  • Real-world agricultural problem

  • Social + economic impact

  • Explainable ML (important for farmers)

  • Can be extended to yield loss prediction

  • Faculty-friendly & industry-relevant


🔹 Project Objectives

  • Predict disease presence (Yes/No)

  • Analyze factors causing disease

  • Provide early warning

  • (Optional) Predict severity or yield loss


🔹 Dataset (Non-image, Data Science based)

Input Features (examples)

  • Temperature (°C)

  • Humidity (%)

  • Rainfall (mm)

  • Soil moisture

  • Soil pH

  • Crop type

  • Season

  • Fertilizer usage

  • Pesticide usage

Output

  • Disease (0 = Healthy, 1 = Diseased)

📌 Datasets:

  • Kaggle: Crop Disease / Agriculture datasets

  • Government agriculture data

  • Synthetic dataset (acceptable for minor project)


🔹 Machine Learning Models Used

✅ Logistic Regression (Main Model)

Used because:

  • Output is binary

  • Easy to interpret coefficients

  • Works well with tabular data

Equation:

$$P(\text{Disease}) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \cdots + \beta_n x_n)}}$$
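
As a quick worked example, the equation can be evaluated by hand for one observation. The intercept, the two coefficients, and the feature values below are made-up numbers chosen only to show the arithmetic.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical fitted parameters (not from a real model)
beta0 = -6.0
betas = {"humidity": 0.08, "temperature": -0.02}

# One observation: 85 % humidity, 28 °C
x = {"humidity": 85.0, "temperature": 28.0}

# z = beta0 + beta1*x1 + beta2*x2 = -6.0 + 6.8 - 0.56 = 0.24
z = beta0 + sum(betas[name] * x[name] for name in betas)
p = sigmoid(z)

print("z =", round(z, 2))
print("P(Disease) =", round(p, 3))
print("Predicted class:", int(p >= 0.5))
```

Since z is positive, the probability is above 0.5 and the crop is classified as diseased under the usual 0.5 threshold.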

(Optional) Linear Regression

  • Predict severity level

  • Predict expected yield loss


🔹 System Architecture

  1. Data Collection

  2. Data Preprocessing

  3. Feature Selection

  4. Logistic Regression Model

  5. Prediction

  6. Result Visualization

  7. Recommendation System


🔹 Implementation Flow (Python)

  • Load dataset

  • Handle missing values

  • Train-test split

  • Train Logistic Regression model

  • Evaluate using:

    • Accuracy

    • Confusion Matrix

    • Precision, Recall

  • Plot:

    • Probability curve

    • Feature importance
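
The flow above could look roughly like this in scikit-learn. A synthetic dataset stands in for the real one; the four columns, the rule that generates the labels, and the injected missing values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load dataset (synthetic here)
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "temperature": rng.normal(27, 5, n),
    "humidity": rng.uniform(30, 100, n),
    "rainfall": rng.gamma(2.0, 40.0, n),
    "soil_ph": rng.normal(6.5, 0.6, n),
})
# Hypothetical rule: humid, wet conditions favour disease
logit = 0.07 * (df["humidity"] - 65) + 0.01 * (df["rainfall"] - 80)
df["disease"] = (rng.uniform(0, 1, n) < 1 / (1 + np.exp(-logit))).astype(int)

# Blank out some readings so the missing-value step has work to do
df.loc[rng.choice(n, 25, replace=False), "humidity"] = np.nan

# Train-test split
X = df.drop(columns="disease")
y = df["disease"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Handle missing values, scale, and train Logistic Regression
model = make_pipeline(
    SimpleImputer(strategy="median"), StandardScaler(), LogisticRegression()
)
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("Accuracy:", acc)
print("Confusion Matrix:\n", cm)
print("Precision:", precision_score(y_test, y_pred))
print("Recall:", recall_score(y_test, y_pred))

# Feature importance from standardized coefficients
coefs = pd.Series(
    model.named_steps["logisticregression"].coef_[0], index=X.columns
)
print(coefs.sort_values())
```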


🔹 Results to Show (VERY IMPORTANT)

  • Disease prediction accuracy

  • Confusion matrix

  • Probability vs threshold graph

  • Feature impact analysis

  • Sample predictions
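
For the probability vs threshold graph, the underlying numbers can be produced by sweeping the decision threshold and recomputing precision and recall each time. The eight labels and predicted probabilities below are hand-made illustrative values, not model output.

```python
import numpy as np

# Made-up true labels and predicted P(Disease) for eight samples
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
p_hat = np.array([0.10, 0.35, 0.40, 0.55, 0.65, 0.80, 0.20, 0.90])


def precision_recall_at(threshold):
    """Classify as diseased when p_hat >= threshold, then score the result."""
    y_pred = (p_hat >= threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


for t in (0.3, 0.5, 0.7):
    precision, recall = precision_recall_at(t)
    print(f"threshold={t:.1f}  precision={precision:.2f}  recall={recall:.2f}")
```

Lowering the threshold catches every diseased sample at the cost of more false alarms; raising it does the opposite. Plotting these pairs against the threshold gives the graph.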


🔹 Future Scope (Makes project BIG)

  • Image-based disease detection (CNN)

  • IoT sensor integration

  • Mobile app for farmers

  • Real-time weather API

  • Crop recommendation system


10. Intelligent Crop Yield Forecasting System

(Different from disease prediction)

Problem:
Predict crop yield before harvest.

Models:

  • Linear Regression → yield (tons/hectare)

  • Logistic Regression → low / normal yield risk

Features:
Rainfall, soil nutrients, fertilizer, season


11. Air Pollution Level Prediction & Health Risk Alert

Problem:
Predict AQI and classify health risk.

Models:

  • Linear Regression → AQI value

  • Logistic Regression → hazardous / safe

Features:
PM2.5, PM10, NO₂, SO₂, temperature


 


Monday, 2 February 2026

DATA SCIENCE LAB MANUAL

 

 Example 1: Linear Regression

PREDICTING PRICE BASED ON AREA

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# 1. Example Data
# X = Input (Area of house)
# Y = Output (Price)
X = np.array([100, 200, 300, 400, 500]).reshape(-1, 1)
Y = np.array([10, 20, 30, 40, 50])

# 2. Create and train the model
model = LinearRegression()   # intercept is included by default
model.fit(X, Y)

# 3. Get slope and intercept
m = model.coef_[0]        # slope
b = model.intercept_      # intercept

# 4. Prediction
x_new = 350
y_pred = model.predict([[x_new]])

# 5. Plot the graph
X_line = np.linspace(100, 500, 100).reshape(-1, 1)
Y_line = model.predict(X_line)

plt.scatter(X, Y)          # original data points
plt.plot(X_line, Y_line)   # regression line
plt.scatter(x_new, y_pred) # predicted point
plt.xlabel("Area")
plt.ylabel("Price")
plt.title("Simple Linear Regression")
plt.show()

# 6. Print results
print("Slope (m):", m)
print("Intercept (b):", b)
print("Equation: Y =", m, "* X +", b)
print("Predicted Price for", x_new, ":", y_pred[0])








 Example 2: Linear Regression (Study Hours vs Marks)

👉 Problem Statement

Predict student marks based on study hours.



import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

# Data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)  # Study hours
Y = np.array([35, 40, 50, 60, 70])            # Marks

# Model
model = LinearRegression()
model.fit(X, Y)

# Slope and intercept
m = model.coef_[0]
b = model.intercept_

# Prediction
hours = 6
predicted_marks = model.predict([[hours]])

# Plot
X_line = np.linspace(1, 6, 100).reshape(-1, 1)
Y_line = model.predict(X_line)

plt.scatter(X, Y)                    # original data points
plt.plot(X_line, Y_line)             # regression line
plt.scatter(hours, predicted_marks)  # predicted point
plt.xlabel("Study Hours")
plt.ylabel("Marks")
plt.title("Linear Regression: Study Hours vs Marks")
plt.show()

# Output
print("Slope (m):", m)
print("Intercept (b):", b)
print("Equation: Y =", m, "* X +", b)
print("Predicted Marks for", hours, "hours:", predicted_marks[0])
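
As a cross-check, the same slope and intercept can be computed directly from the least-squares formulas, without scikit-learn. This uses the study-hours data from the example; the variable names are my own.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5], dtype=float)       # study hours
y = np.array([35, 40, 50, 60, 70], dtype=float)  # marks

x_mean, y_mean = x.mean(), y.mean()              # 3.0 and 51.0

# slope = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean)^2) = 90 / 10
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)

# intercept = y_mean - slope * x_mean = 51 - 9 * 3
intercept = y_mean - slope * x_mean

marks_for_6_hours = slope * 6 + intercept

print("Slope (m):", slope)                      # 9.0
print("Intercept (b):", intercept)              # 24.0
print("Predicted marks:", marks_for_6_hours)    # 78.0
```

The fitted line is Y = 9X + 24, so six hours of study gives a predicted 78 marks.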






2. Logistic Regression code  

Program 1: Student Pass / Fail Prediction



import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix

# Dataset
data = {
    'Hours_Studied': [1,2,3,4,5,6,7,8],
    'Passed': [0,0,0,0,1,1,1,1]
}

df = pd.DataFrame(data)

# Features and target
X = df[['Hours_Studied']]
y = df['Passed']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Model
model = LogisticRegression()
model.fit(X_train, y_train)

# Prediction
y_pred = model.predict(X_test)

# Results
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("Intercept (β0):", model.intercept_)
print("Coefficient (β1):", model.coef_)  

OUTPUT 
Accuracy: 1.0
Confusion Matrix:
 [[1 0]
 [0 1]]
Intercept (β0): [-4.4350521]
Coefficient (β1): [[1.01750279]]
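
The printed intercept and coefficient are enough to reproduce the model's probabilities by hand. The sketch below plugs them into the logistic function and finds the number of study hours at which the predicted probability of passing crosses 0.5; the function and variable names are my own.

```python
import math

# Values copied from the output above
beta0 = -4.4350521
beta1 = 1.01750279


def p_pass(hours):
    """P(Passed = 1) for a given number of study hours."""
    return 1.0 / (1.0 + math.exp(-(beta0 + beta1 * hours)))


# The decision boundary is where beta0 + beta1 * hours = 0
boundary = -beta0 / beta1
print("Decision boundary (hours):", round(boundary, 2))

for h in (3, 4, 5, 6):
    print(h, "hours -> P(pass) =", round(p_pass(h), 3))
```

Students studying fewer hours than the boundary are predicted to fail, and those above it are predicted to pass.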

Program 2: Salary > 50K Prediction (Binary Classification)

# Logistic Regression – Example 2 (Salary Prediction)
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Dataset
data = {
    'Age': [22,25,30,35,40,45,50,55],
    'Experience': [0,1,3,5,7,10,15,20],
    'High_Salary': [0,0,0,1,1,1,1,1]
}

df = pd.DataFrame(data)

# Features and target
X = df[['Age', 'Experience']]
y = df['High_Salary']

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Model
model = LogisticRegression()
model.fit(X_train, y_train)

# Prediction
y_pred = model.predict(X_test)

# Results
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Intercept (β0):", model.intercept_)
print("Coefficients (β1, β2):", model.coef_)

OUTPUT
Accuracy: 0.6666666666666666
Intercept (β0): [-13.33988536]
Coefficients (β1, β2): [[0.45410297 0.17502853]]

Monday, 12 January 2026

Blockchain technology

 

Blockchain technology is a decentralized, distributed digital ledger that securely records transactions across many computers, creating an immutable (unchangeable) and transparent record of data, like ownership or transactions, without needing a central authority like a bank. Data is grouped into "blocks," cryptographically linked in a chronological "chain," and validated by network consensus, making it highly secure and resistant to tampering. 
How it works:
  • Blocks & Chains: Transactions are bundled into blocks. When a block fills, it's sealed with a unique digital fingerprint (cryptographic hash) and linked to the previous block, forming a chain.
  • Decentralization: Instead of one central server, copies of the ledger are held by many computers (nodes) on the network, removing single points of failure.
  • Consensus: Before a new block is added, most nodes must agree on its validity through a consensus mechanism, ensuring trust.
  • Immutability: Once a block is added, altering its data would require changing all subsequent blocks across the majority of the network, which is computationally infeasible. 
Key Features:
  • Transparency: All participants can view transactions.
  • Security: Cryptography and decentralization make it tamper-proof.
  • No Intermediaries: Reduces reliance on third parties, lowering costs and increasing efficiency. 
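
The block-linking and immutability ideas from "How it works" can be illustrated with a toy hash chain. This is a sketch only: it has no network, consensus, or signatures, and the block layout and function names are assumptions rather than any real blockchain's format.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, transactions):
    """SHA-256 fingerprint of a block's contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()


def add_block(chain, transactions):
    """Seal a new block and link it to the previous block's hash."""
    index = len(chain)
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    chain.append({
        "index": index,
        "prev_hash": prev_hash,
        "transactions": transactions,
        "hash": block_hash(index, prev_hash, transactions),
    })


def is_valid(chain):
    """Recompute every hash and check every link."""
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else GENESIS_PREV
        if block["prev_hash"] != expected_prev:
            return False
        recomputed = block_hash(
            block["index"], block["prev_hash"], block["transactions"]
        )
        if block["hash"] != recomputed:
            return False
    return True


chain = []
add_block(chain, ["Alice pays Bob 5"])
add_block(chain, ["Bob pays Carol 2"])
add_block(chain, ["Carol pays Dave 1"])
valid_before = is_valid(chain)
print("Valid before tampering:", valid_before)

# Tamper with an old block: its stored hash no longer matches its contents
chain[1]["transactions"] = ["Bob pays Mallory 200"]
valid_after = is_valid(chain)
print("Valid after tampering:", valid_after)
```

To hide the change, an attacker would have to recompute the hash of the altered block and of every block after it, on most copies of the ledger, which is what makes tampering impractical on a real network.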
Uses:

  • Cryptocurrencies: Like Bitcoin, to track digital money.
  • Supply Chain: To track goods transparently.
  • Smart Contracts: Self-executing contracts with terms directly written into code.
  • Digital Identity & Voting: Securing personal data and votes. 


Security in blockchain technology involves addressing vulnerabilities at multiple layers, from the underlying network infrastructure to the application-specific smart contract code. Key issues include consensus manipulation, smart contract logic flaws, and external operational risks. 
Blockchain-Related Issues
These vulnerabilities target the foundational infrastructure and protocols of the blockchain network. 
  • 51% Attacks: An attacker or group gains control of more than 50% of the network's total computing power (hash power in Proof of Work) or staked tokens (in Proof of Stake), allowing them to manipulate transaction ordering, prevent new transactions from being confirmed, and perform "double-spending" of coins.
  • Consensus Mechanism Attacks: Issues in consensus algorithms (e.g., selfish mining, block withholding) can be exploited to gain a disproportionate amount of block rewards or disrupt network operations.
  • Network Attacks: The network layer is vulnerable to standard network threats like Distributed Denial of Service (DDoS) attacks, Sybil attacks (creating multiple fake identities), and routing attacks that can intercept or modify data transmission between nodes.
  • Private Key Security: A primary operational risk is the compromise or mismanagement of users' private keys, which control access to digital assets. This often occurs through phishing or insecure storage, as transactions are irreversible once signed with a private key. 
Higher-Level Language (Solidity) Related Issues 
These issues stem from vulnerabilities and anti-patterns in the smart contract programming language itself, such as Solidity. 
  • Reentrancy: A malicious contract repeatedly calls a vulnerable contract's function before the first execution is complete, often to drain funds. The "Checks-Effects-Interactions" pattern is the standard mitigation.
  • Arithmetic Overflows/Underflows: Fixed-size integer types in Solidity (before version 0.8.0, which added automatic checks) can lead to values wrapping around their maximum or minimum limits, resulting in incorrect calculations and potential fund manipulation.
  • Unauthorized Access: Missing or improperly implemented access control modifiers (like onlyOwner) can allow unauthorized users to execute critical functions.
  • Insecure Randomness: Smart contracts rely on deterministic processes, making true randomness difficult to achieve on-chain. Using predictable variables like block.timestamp or blockhash for random numbers can be exploited by miners or attackers.
  • Front-Running: Attackers monitor the mempool (where unconfirmed transactions wait) for profitable transactions and submit their own transaction with a higher gas fee to have it processed first, gaining an unfair advantage. 
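
Reentrancy is easier to see in a small simulation. The Python classes below only imitate contract behaviour (a callback plays the role of the external call); the class and function names are hypothetical and this is not Solidity or EVM code.

```python
class VulnerableVault:
    """Pays out BEFORE zeroing the caller's balance (interaction before effect)."""

    def __init__(self):
        self.balances = {}
        self.total = 0

    def deposit(self, who, amount):
        self.balances[who] = self.balances.get(who, 0) + amount
        self.total += amount

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.total < amount:   # check
            return
        self.total -= amount
        on_receive(amount)                       # interaction: external call
        self.balances[who] = 0                   # effect: happens too late


class SafeVault(VulnerableVault):
    """Same vault, following Checks-Effects-Interactions."""

    def withdraw(self, who, on_receive):
        amount = self.balances.get(who, 0)
        if amount == 0 or self.total < amount:   # check
            return
        self.balances[who] = 0                   # effect first
        self.total -= amount
        on_receive(amount)                       # interaction last


def attack(vault):
    """Attacker deposits 10, then re-enters withdraw() from the callback."""
    vault.deposit("victim", 90)
    vault.deposit("attacker", 10)
    received = []

    def on_receive(amount):
        received.append(amount)
        vault.withdraw("attacker", on_receive)   # re-enter before state update

    vault.withdraw("attacker", on_receive)
    return sum(received)


drained = attack(VulnerableVault())
limited = attack(SafeVault())
print("Vulnerable vault paid out:", drained)  # 100: victim's funds included
print("Safe vault paid out:", limited)        # 10: only the attacker's deposit
```

In the safe version the balance is already zero when the callback re-enters, so the second call returns immediately.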
EVM Bytecode Related Issues
Vulnerabilities at the Ethereum Virtual Machine (EVM) bytecode level are related to how the compiled code executes. 
  • Stack Overflows/Underflows: The EVM uses a stack for computation; if the stack limits are exceeded due to recursive calls or complex logic, the contract can crash or behave unexpectedly.
  • Call Stack Depth Limit: The EVM has a call stack depth limit. Maliciously crafted contracts can exploit this to cause a denial-of-service condition for other contracts.
  • Short Address Attack: An attacker can manipulate transaction input data to exploit insufficient address length checks, potentially redirecting funds to their own address or causing other errors.
  • Unchecked External Calls: If a smart contract calls another contract without verifying the return value, a failed external call may not revert the entire transaction, leading to inconsistent state in the calling contract. 
Real-Life Attacks on Blockchain Applications/Smart Contracts 
  • The DAO Hack (2016): An attacker exploited a reentrancy vulnerability in the smart contract to drain approximately $60 million worth of Ether. This event led to a contentious hard fork, splitting Ethereum into Ethereum (ETH) and Ethereum Classic (ETC).
  • Parity Wallet Hack (2017): A bug in a multi-signature wallet library left its initialization function callable by anyone. A user claimed ownership of the shared library contract and then self-destructed it, freezing about $150 million worth of Ether indefinitely in the wallets that depended on it.
  • Ronin Bridge Hack (2022): Attackers compromised the private keys of five out of nine validator nodes for the Ronin Network's cross-chain bridge, enabling them to forge withdrawal transactions and steal over $600 million in one of the largest DeFi hacks to date.
  • Mt. Gox Exchange Collapse (2014): This was an attack on a centralized exchange, but it highlighted the vulnerability of centralized entities within the broader crypto ecosystem. The exchange attributed its losses to a Bitcoin transaction malleability issue, although the exact cause remains disputed; roughly 850,000 BTC were reported missing and the exchange went bankrupt.
Trusted Execution Environments (TEEs) 
Trusted Execution Environments (TEEs), such as Intel SGX, are a potential solution for enhancing blockchain security and privacy. 
  • Function: TEEs create a secure, isolated environment (enclave) within a processor where data and code can be executed with integrity and confidentiality, even if the host operating system or network is compromised.
  • Benefits for Blockchain:
    • Privacy: TEEs can process private or sensitive data off-chain within the enclave without revealing the raw data to the public ledger or even the node operator, thus enhancing data privacy.
    • Security: By isolating smart contract execution, TEEs can protect against certain host-level attacks and ensure the integrity of the computation, even if a node is malicious.
    • Scalability: Offloading complex computations to TEEs can potentially reduce the load on the main blockchain, improving scalability and transaction speeds.
  • Limitations and Issues: TEEs themselves can have vulnerabilities (e.g., side-channel attacks). Furthermore, their use introduces a degree of reliance on hardware manufacturers (centralization risk) and increases the overall complexity of the system design. 


Blockchain security encompasses vulnerabilities at protocol, language (Solidity), and runtime (EVM) levels, alongside real-world exploits and hardware mitigations like Trusted Execution Environments (TEEs). These issues arise due to blockchain's immutability, public nature, and economic incentives for attacks. Detailed notes follow with examples from common lectures on smart contract security.

Blockchain Issues

Blockchain platforms face consensus, scalability, and economic attacks.

  • 51% Attacks: Miner majority rewrites history; example—Ethereum Classic (ETC) lost $1.1M in 2019 as attackers double-spent via majority hash power.

  • Sybil Attacks: Flooding network with fake nodes; countered by Proof-of-Stake (PoS) in Ethereum 2.0.

  • Eclipse Attacks: Isolating nodes to manipulate views; risks private chain forks.

Solidity Issues

Solidity, Ethereum's primary language, introduces high-level pitfalls exploitable due to its Turing-completeness and lack of safe defaults.

  • Reentrancy: External calls before state updates allow recursive calls; DAO hack (2016) drained $60M by reentering withdrawal function.

  • Integer Overflow/Underflow: Pre-0.8.0, unchecked math wraps values; example—attackers mint extra tokens via uint overflow in bad ERC-20.

  • Access Control Flaws: Missing modifiers like onlyOwner; unprotected functions let anyone self-destruct contracts.

  • Timestamp Dependence: block.timestamp is miner-manipulable; used in gambling contracts for predictable "randomness."

  • Front-Running: Mempool scanning to bid higher gas; DEX arbitrage bots frontrun trades for profit.
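
The integer overflow item can be reproduced with plain integer arithmetic. Python integers do not overflow, so the sketch masks results to 256 bits to imitate pre-0.8.0 `uint256` behaviour; the helper names and the batch-transfer style scenario are assumptions for illustration.

```python
UINT256_MAX = 2**256 - 1


def unchecked_mul(a, b):
    """Imitates pre-0.8.0 uint256 multiplication: the result silently wraps."""
    return (a * b) & UINT256_MAX


def checked_mul(a, b):
    """Imitates Solidity >= 0.8.0 (or SafeMath): overflow aborts the call."""
    result = a * b
    if result > UINT256_MAX:
        raise OverflowError("uint256 overflow")
    return result


# Hypothetical batch transfer: send `value` tokens to each of 2 receivers
receivers = 2
value = 2**255
sender_balance = 0

# Unchecked: 2 * 2**255 = 2**256, which wraps to 0
total_needed = unchecked_mul(receivers, value)
check_passes = sender_balance >= total_needed
print("Total needed (wrapped):", total_needed)
print("Balance check passes with zero balance:", check_passes)

# Checked: the same multiplication reverts instead of wrapping
try:
    checked_mul(receivers, value)
    reverted = False
except OverflowError:
    reverted = True
print("Checked version reverted:", reverted)
```

Because the wrapped total is zero, a sender with no tokens passes the balance check while each receiver is still credited the huge `value`.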

EVM Bytecode Issues

EVM executes low-level bytecode, exposing gas mechanics and opcodes to abuse.

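  • Gas Limit DoS: Loops over unbounded arrays can exceed the block gas limit (about 30M on Ethereum for several years; the value is adjustable and has since been raised); attackers grow the array until the function fails every time it is called.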

  • Unchecked Calls: call() returns false on failure but doesn't revert; if the return value is ignored, a failed transfer goes unnoticed.

  • Delegatecall Risks: Context swaps enable storage overwrites; Parity Wallet (2017) lost $30M via delegatecall bug initializing wrong library.

  • Opcode Limits: STATICCALL (introduced in the Byzantium fork, EIP-214) prevents state changes in view functions.

Real-Life Attacks

Historical exploits highlight patterns.

  • The DAO (2016): Reentrancy; led to Ethereum hard fork.

  • Parity Multi-Sig (2017): Self-destruct vulnerability in the shared wallet library; estimates of the Ether frozen range from about $150M to $280M.

  • BeautyChain (BEC, 2018): Integer overflow in batchTransfer (the product cnt * _value wrapped to zero), letting attackers credit themselves enormous token balances.

  • bZx Flash Loans (2020): Price oracle manipulation via repeated trades in one tx.

  • Cream Finance (2021): Flash loan + oracle exploit drained $130M.

Attack | Cause        | Loss                      | Fix
DAO    | Reentrancy   | $60M                      | Checks-Effects-Interactions
Parity | Delegatecall | about $150M-$280M frozen  | Initialization guards
BEC    | Overflow     | Millions                  | SafeMath/OpenZeppelin

Trusted Execution Environments

TEEs provide hardware-based isolation for off-chain computation, enhancing blockchain privacy/scalability.

  • SGX (Intel): Enclaves shield code/data from host OS; Secret Network uses for private smart contracts.

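  • Issues: Side-channel attacks (Spectre/Meltdown-class, e.g., Foreshadow against SGX) and trust in the vendor's attestation service; researchers have also shown that code running inside an enclave can extract keys from other enclaves via cache side channels.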

  • Use Cases: Oracle feeds (e.g., Phala Network), ZK proof generation without revealing inputs.

  • Alternatives: AWS Nitro Enclaves, ARM TrustZone for mobile blockchain apps.

