PYTHON Tutorial

Common ML Algorithms

Linear Regression:

  • Goal: Predict a continuous value as a linear function of one or more input variables.
  • Steps:
    • Collect data on independent and dependent variables.
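    • Fit a line to the data by least squares, that is, choose the slope and intercept that minimize the sum of squared differences between predicted and observed values.
    • Use the fitted line to predict values for new inputs.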
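To make the least-squares step concrete, here is a minimal sketch for a single input variable using only the standard library. The function name fit_line and the data points are made up for illustration; the points lie exactly on y = 2x + 1 so the result is easy to check by hand.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Illustrative data (assumption): points on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predict y for a new input x = 5
print(slope, intercept, prediction)
```

With real, noisy data the points will not fall exactly on the line, and the same formulas return the line with the smallest total squared error.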

Logistic Regression:

  • Goal: Predict the probability of a binary outcome (0 or 1) using a logistic (sigmoid) function.
  • Steps:
    • Collect data on independent variables and binary outcomes.
    • Fit a logistic function to the data using maximum likelihood estimation.
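    • Use the fitted function to estimate the probability of the outcome for new inputs, then threshold it (commonly at 0.5) to get a 0 or 1 prediction.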
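The steps above can be sketched from scratch as follows. This is only an illustration, not how a library such as scikit-learn solves the problem: it maximizes the log-likelihood with plain gradient ascent, and the learning rate, epoch count, function names, and data are all assumptions chosen for the example.

```python
import math


def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Maximize the mean log-likelihood by gradient ascent; returns (w, b)."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Residuals (observed outcome minus predicted probability)
        residuals = [y - sigmoid(w * x + b) for x, y in zip(xs, ys)]
        # Gradient of the mean log-likelihood with respect to w and b
        grad_w = sum(r * x for r, x in zip(residuals, xs)) / n
        grad_b = sum(residuals) / n
        w += lr * grad_w
        b += lr * grad_b
    return w, b


# Illustrative data (assumption): larger x makes outcome 1 more likely,
# with some overlap so that a finite maximum-likelihood solution exists.
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1]

w, b = fit_logistic(xs, ys)


def predict(x):
    """Threshold the predicted probability at 0.5."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


print(predict(0.5), predict(4.0))
```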

Decision Trees:

  • Goal: Classify data into two or more categories using a tree of attribute tests.
  • Steps:
    • Collect data on attributes and categories.
    • Build a tree where each internal node tests an attribute, each branch corresponds to an outcome of that test, and each leaf assigns a category.
    • Classify new data by traversing the tree from the root to a leaf.
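As a small illustration of building and traversing a tree, the sketch below fits scikit-learn's DecisionTreeClassifier to a toy dataset and prints the learned tests. The attribute names (height_cm, weight_kg), the categories, and the values are hypothetical.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [height_cm, weight_kg] -> category
X = [[150, 50], [160, 55], [165, 60], [180, 80], [185, 90], [190, 95]]
y = ["small", "small", "small", "large", "large", "large"]

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# Show the attribute test at each node and the category at each leaf
print(export_text(tree, feature_names=["height_cm", "weight_kg"]))

# Classifying a new point follows the tests from the root down to a leaf
predictions = tree.predict([[155, 52], [188, 92]])
print(predictions)
```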

Random Forests:

  • Goal: Improve accuracy and reduce overfitting by combining many decision trees.
  • Steps:
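    • Create multiple decision trees, each trained on a random sample of the data (drawn with replacement) and considering a random subset of attributes at each split.
    • Combine the predictions of the trees by majority vote (classification) or averaging (regression).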
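A minimal sketch of these two steps with scikit-learn's RandomForestClassifier, on made-up data with two well-separated groups. The number of trees and the random seed are arbitrary choices for the example.

```python
from sklearn.ensemble import RandomForestClassifier

# Illustrative data (assumption): two well-separated clusters
X = [[1, 1], [2, 1], [1, 2], [2, 2], [8, 8], [9, 8], [8, 9], [9, 9]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# Each of the 25 trees is trained on its own bootstrap sample of the rows
forest = RandomForestClassifier(n_estimators=25, random_state=0)
forest.fit(X, y)

# The individual fitted trees are available for inspection
n_trees = len(forest.estimators_)

# scikit-learn combines the trees by averaging their predicted class
# probabilities, a soft form of majority voting
predictions = forest.predict([[1.5, 1.5], [8.5, 8.5]])
print(n_trees, predictions)
```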

Support Vector Machines (SVM):

  • Goal: Classify data into two or more categories by finding the best hyperplane that separates the data.
  • Steps:
    • Collect data on attributes and categories.
    • Find the hyperplane that maximizes the margin, that is, the distance from the hyperplane to the nearest training points of each category.
    • Classify new data points by which side of the hyperplane they fall on.
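The sketch below shows the idea for two categories using scikit-learn's SVC with a linear kernel, so that the decision boundary is a straight line in two dimensions. The data points and the value of C are assumptions made for illustration.

```python
from sklearn.svm import SVC

# Illustrative data (assumption): two linearly separable groups
X = [[1, 1], [2, 1], [1, 2], [6, 6], [7, 6], [6, 7]]
y = [0, 0, 0, 1, 1, 1]

# A linear kernel keeps the separating surface a flat hyperplane
svm = SVC(kernel="linear", C=1.0)
svm.fit(X, y)

new_points = [[2, 2], [5, 6]]

# decision_function returns w.x + b; its sign tells which side of the
# hyperplane a point lies on (negative -> class 0, positive -> class 1)
scores = svm.decision_function(new_points)
labels = svm.predict(new_points)
print(scores, labels)
```

The training points closest to the boundary, available as svm.support_vectors_, are the ones that determine the margin.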

K-Nearest Neighbors (KNN):

  • Goal: Classify a data point, or predict its value, by finding the most similar examples in the training data.
  • Steps:
    • Collect data on attributes and categories or values.
    • For a new data point, find the K nearest neighbors in the training data using a distance measure such as Euclidean distance.
    • Classify the new data point by majority vote of the neighbors, or predict its value as their (optionally distance-weighted) average.
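Because KNN has no real fitting step, it is short enough to sketch from scratch. The version below handles classification by unweighted majority vote with Euclidean distance; the function name knn_classify, the choice of k = 3, and the data are assumptions for illustration.

```python
import math
from collections import Counter


def knn_classify(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    # Rank training indices by Euclidean distance to the query point
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    nearest_labels = [train_labels[i] for i in order[:k]]
    # Majority vote among the k nearest neighbors
    return Counter(nearest_labels).most_common(1)[0][0]


# Illustrative data (assumption): two clusters labeled "a" and "b"
train_points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.5),
                (6.0, 6.0), (6.5, 7.0), (7.0, 6.5)]
train_labels = ["a", "a", "a", "b", "b", "b"]

label_near_a = knn_classify(train_points, train_labels, (1.8, 1.6))
label_near_b = knn_classify(train_points, train_labels, (6.2, 6.4))
print(label_near_a, label_near_b)
```

An odd k avoids tied votes when there are only two categories.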

Python Example:

# Overview of commonly used machine learning algorithms

import pandas as pd
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

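# Load the dataset. Assumed columns (the original does not specify them):
#   'x'     - numeric feature
#   'y'     - continuous target, used for regression
#   'label' - class labels, used by the classifiers (hypothetical column name)
data = pd.read_csv('data.csv')
X = data[['x']]  # double brackets give the 2-D feature table scikit-learn expects

# Linear Regression: the target must be continuous
lin_reg = LinearRegression()
lin_reg.fit(X, data['y'])

# Logistic Regression: the target must be a class label;
# a continuous target makes scikit-learn classifiers raise an error
log_reg = LogisticRegression()
log_reg.fit(X, data['label'])

# Decision Tree
tree = DecisionTreeClassifier()
tree.fit(X, data['label'])

# Random Forest
forest = RandomForestClassifier()
forest.fit(X, data['label'])

# SVM
svm = SVC()
svm.fit(X, data['label'])

# KNN
knn = KNeighborsClassifier()
knn.fit(X, data['label'])

# Each fitted model keeps its own name, so any of them can predict later,
# for example: lin_reg.predict(X) or knn.predict(X)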