[Stanford Univ: CS231n] Spring 2025 Assignment3. Q2(Self-Supervised Learning for Image Classification)


뉴하늘 2025. 6. 4. 14:30


This post summarizes what I studied while taking CS231n: Convolutional Neural Networks for Visual Recognition, offered by the Stanford University School of Engineering.

https://github.com/cs231n/cs231n.github.io/blob/master/assignments/2025/assignment3.md


https://github.com/KwonKiHyeok/CS231n/tree/main

This repository contains my solutions to the assignments of the CS231n course offered by Stanford University (Spring 2025).

Self-Supervised Learning

SimCLR

SimCLR (A Simple Framework for Contrastive Learning of Visual Representations) is a representative self-supervised learning framework that learns image representations without labels. In particular, it uses contrastive learning to learn embeddings that reflect the semantic similarity between images.

The core idea is to apply data augmentation and then train so that different augmented versions of the same image are pulled close together, while augmentations of different images are pushed apart.

SimCLR: Data Augmentation

import random

import torch
from torchvision import transforms

def compute_train_transform(seed=123456):
    """
    This function returns a composition of data augmentations to a single training image.
    Complete the following lines. Hint: look at available functions in torchvision.transforms
    """
    random.seed(seed)
    torch.random.manual_seed(seed)
    
    # Transformation that applies color jitter with brightness=0.4, contrast=0.4, saturation=0.4, and hue=0.1
    color_jitter = transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)  
    
    train_transform = transforms.Compose([
        ##############################################################################
        # TODO: Start of your code.                                                  #
        #                                                                            #
        # Hint: Check out transformation functions defined in torchvision.transforms #
        # The first operation is filled out for you as an example.
        ##############################################################################
        # Step 1: Randomly resize and crop to 32x32.
        transforms.RandomResizedCrop(32),
        # Step 2: Horizontally flip the image with probability 0.5
        transforms.RandomHorizontalFlip(p = 0.5),
        # Step 3: With a probability of 0.8, apply color jitter (you can use "color_jitter" defined above).
        transforms.RandomApply([color_jitter], p = 0.8),
        # Step 4: With a probability of 0.2, convert the image to grayscale
        transforms.RandomGrayscale(p = 0.2),
        ##############################################################################
        #                               END OF YOUR CODE                             #
        ##############################################################################
        transforms.ToTensor(),
        transforms.Normalize([0.4914, 0.4822, 0.4465], [0.2023, 0.1994, 0.2010])])
    return train_transform

from PIL import Image
from torchvision.datasets import CIFAR10

class CIFAR10Pair(CIFAR10):
    """CIFAR10 Dataset.
    """
    def __getitem__(self, index):
        img, target = self.data[index], self.targets[index]
        img = Image.fromarray(img)

        x_i = None
        x_j = None

        if self.transform is not None:
            ##############################################################################
            # TODO: Start of your code.                                                  #
            #                                                                            #
            # Apply self.transform to the image to produce x_i and x_j in the paper      #
            ##############################################################################
            x_i = self.transform(img)
            x_j = self.transform(img)
            ##############################################################################
            #                               END OF YOUR CODE                             #
            ##############################################################################

        if self.target_transform is not None:
            target = self.target_transform(target)

        return x_i, x_j, target
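
To see why calling self.transform twice is enough to build a positive pair, here is a minimal sketch. `toy_augment` is a hypothetical stand-in for the torchvision pipeline above (a random flip plus a little noise), not part of the assignment. It only illustrates that every call draws fresh randomness, so x_i and x_j end up as two different views of the same image.

```python
import torch

def toy_augment(img: torch.Tensor, gen: torch.Generator) -> torch.Tensor:
    """Hypothetical stand-in for self.transform: random horizontal flip + small noise."""
    if torch.rand(1, generator=gen).item() < 0.5:
        img = torch.flip(img, dims=[-1])  # flip along the width axis
    noise = 0.05 * torch.randn(img.shape, generator=gen)
    return img + noise

gen = torch.Generator().manual_seed(0)
img = torch.rand(3, 32, 32, generator=gen)  # fake CIFAR-10-sized image, C x H x W

# Two independent calls on the SAME image give two different views.
x_i = toy_augment(img, gen)
x_j = toy_augment(img, gen)

print(x_i.shape, x_j.shape)   # both are torch.Size([3, 32, 32])
print(torch.equal(x_i, x_j))  # False: the random draws differ between calls
```

In the real dataset class the same thing happens with RandomResizedCrop, RandomHorizontalFlip, and the other random transforms.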

SimCLR: Base Encoder and Projection Head

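import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models.resnet import resnet50

class Model(nn.Module):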
    def __init__(self, feature_dim=128):
        super(Model, self).__init__()

        self.f = []
        for name, module in resnet50().named_children():
            if name == 'conv1':
                module = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
            if not isinstance(module, nn.Linear) and not isinstance(module, nn.MaxPool2d):
                self.f.append(module)
        # encoder
        self.f = nn.Sequential(*self.f)
        # projection head
        self.g = nn.Sequential(nn.Linear(2048, 512, bias=False), nn.BatchNorm1d(512),
                               nn.ReLU(inplace=True), nn.Linear(512, feature_dim, bias=True))

    def forward(self, x):
        x = self.f(x)
        feature = torch.flatten(x, start_dim=1)
        out = self.g(feature)
        return F.normalize(feature, dim=-1), F.normalize(out, dim=-1)
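
The shapes flowing through the projection head can be checked without building the full ResNet-50. The sketch below reuses the layer layout of self.g and feeds it random 2048-dimensional vectors as an assumed stand-in for the flattened encoder output; the variable names h and z are mine, not the assignment's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
feature_dim = 128

# Same layer layout as self.g in Model above.
g = nn.Sequential(
    nn.Linear(2048, 512, bias=False),
    nn.BatchNorm1d(512),
    nn.ReLU(inplace=True),
    nn.Linear(512, feature_dim, bias=True),
)

# Stand-in for torch.flatten(self.f(x), start_dim=1): 4 images -> 4 x 2048 features.
feature = torch.randn(4, 2048)
out = g(feature)

h = F.normalize(feature, dim=-1)  # first return value of Model.forward
z = F.normalize(out, dim=-1)      # second return value, fed to the contrastive loss

print(h.shape, z.shape)           # torch.Size([4, 2048]) torch.Size([4, 128])
print(z.norm(dim=-1))             # every row has (approximately) unit L2 norm
```

Because forward already L2-normalizes both outputs, the cosine similarities computed later reduce to plain dot products.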

SimCLR: Contrastive Loss: Vanilla

def sim(z_i, z_j):
    """Normalized dot product between two vectors.

    Inputs:
    - z_i: 1xD tensor.
    - z_j: 1xD tensor.
    
    Returns:
    - A scalar value that is the normalized dot product between z_i and z_j.
    """
    norm_dot_product = None
    ##############################################################################
    # TODO: Start of your code.                                                  #
    #                                                                            #
    # HINT: torch.linalg.norm might be helpful.                                  #
    ##############################################################################
    norm_dot_product = torch.dot(z_i, z_j)
    norm_dot_product /= torch.linalg.norm(z_i) * torch.linalg.norm(z_j)
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    
    return norm_dot_product

norm_dot_product = torch.dot(z_i, z_j)

Computes the dot product of the two vectors.

norm_dot_product /= torch.linalg.norm(z_i) * torch.linalg.norm(z_j)

Computes the L2 norm of each vector and divides by their product,
which gives the normalized similarity (cosine similarity).
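
As a quick sanity check, the same computation can be compared against PyTorch's built-in F.cosine_similarity on a pair of vectors whose cosine is easy to work out by hand. This is only a sketch; the example vectors are made up.

```python
import math
import torch
import torch.nn.functional as F

def sim(z_i, z_j):
    # Dot product divided by the product of the two L2 norms.
    return torch.dot(z_i, z_j) / (torch.linalg.norm(z_i) * torch.linalg.norm(z_j))

z_i = torch.tensor([1.0, 0.0])
z_j = torch.tensor([1.0, 1.0])  # 45 degrees away from z_i

value = sim(z_i, z_j)
reference = F.cosine_similarity(z_i, z_j, dim=0)

print(value.item())      # about 0.7071, i.e. 1 / sqrt(2)
print(reference.item())  # same value from the built-in
```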

def simclr_loss_naive(out_left, out_right, tau):
    """Compute the contrastive loss L over a batch (naive loop version).
    
    Input:
    - out_left: NxD tensor; output of the projection head g(), left branch in SimCLR model.
    - out_right: NxD tensor; output of the projection head g(), right branch in SimCLR model.
    Each row is a z-vector for an augmented sample in the batch. The same row in out_left and out_right form a positive pair. 
    In other words, (out_left[k], out_right[k]) form a positive pair for all k=0...N-1.
    - tau: scalar value, temperature parameter that determines how fast the exponential increases.
    
    Returns:
    - A scalar value; the total loss across all positive pairs in the batch. See notebook for definition.
    """
    N = out_left.shape[0]  # total number of training examples
    
    # Concatenate out_left and out_right into a 2*N x D tensor.
    out = torch.cat([out_left, out_right], dim=0)  # [2*N, D]
    
    total_loss = 0
    for k in range(N):  # loop through each positive pair (k, k+N)
        z_k, z_k_N = out[k], out[k+N]
        
        ##############################################################################
        # TODO: Start of your code.                                                  #
        #                                                                            #
        # Hint: Compute l(k, k+N) and l(k+N, k).                                     #
        ##############################################################################
        exp = torch.exp(sim(z_k, z_k_N) / tau)
        total_k = 0
        total_k_N = 0

        for j in range(2*N):
            if k != j:  # l(k, k+N)
                total_k += torch.exp(sim(z_k, out[j]) / tau)
            if k + N != j:  # l(k+N, k)
                total_k_N += torch.exp(sim(z_k_N, out[j]) / tau)

        total_loss += -(torch.log(exp / total_k) + torch.log(exp / total_k_N))
        ##############################################################################
        #                               END OF YOUR CODE                             #
        ##############################################################################
    
    # In the end, we need to divide the total loss by 2N, the number of samples in the batch.
    total_loss = total_loss / (2*N)
    return total_loss

exp = torch.exp(sim(z_k, z_k_N) / tau)

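Numerator: the exponentiated similarity of the positive pair, exp(sim(z_k, z_{k+N}) / tau).
Denominator: for the anchor z_k or z_{k+N}, the sum of exp(sim / tau) over every example in the batch except the anchor itself.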

total_loss += -(torch.log(exp / total_k) + torch.log(exp / total_k_N))
total_loss = total_loss / (2*N)

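Both l(k, k+N) and l(k+N, k) are added to the total loss.
Each of the N positive pairs contributes two terms, so there are 2N terms in total and we divide by 2N to take the mean.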
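
Written out as a formula, the loop above computes the following. This is my transcription of the code into LaTeX, using 1-based indices where the code uses 0-based ones:

```latex
\ell(i, j) = -\log
  \frac{\exp\left(\mathrm{sim}(z_i, z_j) / \tau\right)}
       {\sum_{k=1}^{2N} \mathbb{1}_{[k \neq i]} \exp\left(\mathrm{sim}(z_i, z_k) / \tau\right)},
\qquad
L = \frac{1}{2N} \sum_{k=1}^{N} \left[ \ell(k, k+N) + \ell(k+N, k) \right]
```

Here `exp` in the code is the numerator, and `total_k` and `total_k_N` are the denominators of l(k, k+N) and l(k+N, k) respectively.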

SimCLR: Contrastive Loss: Vectorized

def sim_positive_pairs(out_left, out_right):
    """Normalized dot product between positive pairs.

    Inputs:
    - out_left: NxD tensor; output of the projection head g(), left branch in SimCLR model.
    - out_right: NxD tensor; output of the projection head g(), right branch in SimCLR model.
    Each row is a z-vector for an augmented sample in the batch.
    The same row in out_left and out_right form a positive pair.
    
    Returns:
    - A Nx1 tensor; each row k is the normalized dot product between out_left[k] and out_right[k].
    """
    pos_pairs = None
    
    ##############################################################################
    # TODO: Start of your code.                                                  #
    #                                                                            #
    # HINT: torch.linalg.norm might be helpful.                                  #
    ##############################################################################
    # Compute dot product for each pair (row-wise)
    numerator = torch.sum(out_left * out_right, dim = 1, keepdim = True)
    # Compute L2 norms
    norm_left = torch.linalg.norm(out_left, dim = 1, keepdim = True)
    norm_right = torch.linalg.norm(out_right, dim = 1, keepdim = True)
    # Compute normalized dot product (cosine similarity)
    pos_pairs = numerator / (norm_left * norm_right)
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return pos_pairs

def compute_sim_matrix(out):
    """Compute a 2N x 2N matrix of normalized dot products between all pairs of augmented examples in a batch.

    Inputs:
    - out: 2N x D tensor; each row is the z-vector (output of projection head) of a single augmented example.
    There are a total of 2N augmented examples in the batch.
    
    Returns:
    - sim_matrix: 2N x 2N tensor; each element i, j in the matrix is the normalized dot product between out[i] and out[j].
    """
    sim_matrix = None
    
    ##############################################################################
    # TODO: Start of your code.                                                  #
    ##############################################################################
    # 1. L2 norm: shape (2N, 1)
    norm = torch.linalg.norm(out, dim = 1, keepdim = True) 
    # 2. Normalize each row: shape (2N, D)
    normalized_out = out / norm 
    # 3. Compute similarity via matrix multiplication
    sim_matrix = normalized_out @ normalized_out.T
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    return sim_matrix

1. Normalize:

Dividing each vector z_i by its L2 norm turns it into a unit vector,
so the dot product itself becomes the cosine similarity.
That is, sim(z_i, z_j) = normalized_i ⋅ normalized_j

2. Vector normalization (using broadcasting):

norm = torch.linalg.norm(out, dim=1, keepdim=True)  # shape: (2N, 1)
normalized_out = out / norm  # shape: (2N, D)

sim_matrix = normalized_out @ normalized_out.T  # shape: (2N, 2N)
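
Two properties make for an easy check of the result: the matrix should be symmetric, and its diagonal should be all ones because sim(z_i, z_i) = 1. A small sketch on random data (the sizes N and D are arbitrary):

```python
import torch

torch.manual_seed(0)
N, D = 3, 5
out = torch.randn(2 * N, D)  # fake z-vectors for 2N augmented samples

norm = torch.linalg.norm(out, dim=1, keepdim=True)  # [2N, 1]
normalized_out = out / norm                         # broadcast over the D columns
sim_matrix = normalized_out @ normalized_out.T      # [2N, 2N]

is_symmetric = torch.allclose(sim_matrix, sim_matrix.T, atol=1e-6)
unit_diagonal = torch.allclose(torch.diagonal(sim_matrix), torch.ones(2 * N), atol=1e-6)

print(sim_matrix.shape)             # torch.Size([6, 6])
print(is_symmetric, unit_diagonal)  # True True
```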

def simclr_loss_vectorized(out_left, out_right, tau, device='cuda'):
    """Compute the contrastive loss L over a batch (vectorized version). No loops are allowed.
    
    Inputs and output are the same as in simclr_loss_naive.
    """
    N = out_left.shape[0]
    
    # Concatenate out_left and out_right into a 2*N x D tensor.
    out = torch.cat([out_left, out_right], dim=0)  # [2*N, D]
    
    # Compute similarity matrix between all pairs of augmented examples in the batch.
    sim_matrix = compute_sim_matrix(out)  # [2*N, 2*N]
    
    ##############################################################################
    # TODO: Start of your code. Follow the hints.                                #
    ##############################################################################
    
    # Step 1: Use sim_matrix to compute the denominator value for all augmented samples.
    # Hint: Compute e^{sim / tau} and store into exponential, which should have shape 2N x 2N.
    exponential = None
    exponential = torch.exp(sim_matrix / tau)
    # This binary mask zeros out terms where k=i.
    mask = (torch.ones_like(exponential, device=device) - torch.eye(2 * N, device=device)).to(device).bool()
    
    # We apply the binary mask.
    exponential = exponential.masked_select(mask).view(2 * N, -1)  # [2*N, 2*N-1]
    
    # Hint: Compute the denominator values for all augmented samples. This should be a 2N x 1 vector.
    denom = None
    denom = torch.sum(exponential, dim = 1, keepdim = True)
    # Step 2: Compute similarity between positive pairs.
    # You can do this in two ways: 
    # Option 1: Extract the corresponding indices from sim_matrix. 
    # Option 2: Use sim_positive_pairs().
    pos_left = sim_positive_pairs(out_left, out_right)
    pos_right = sim_positive_pairs(out_right, out_left)
    pos_pairs = torch.cat([pos_left, pos_right], dim = 0)
    
    # Step 3: Compute the numerator value for all augmented samples.
    numerator = None
    numerator = torch.exp(pos_pairs / tau)
    
    # Step 4: Now that you have the numerator and denominator for all augmented samples, compute the total loss.
    loss = None
    loss = -torch.log(numerator/denom)
    loss = loss.mean()
    ##############################################################################
    #                               END OF YOUR CODE                             #
    ##############################################################################
    
    return loss

Step 1: Compute the similarity between all pairs and build the softmax denominator

sim_matrix = compute_sim_matrix(out)  # [2N, 2N]
exponential = torch.exp(sim_matrix / tau)
mask = (torch.ones_like(exponential) - torch.eye(2 * N, device=device)).bool()
exponential = exponential.masked_select(mask).view(2 * N, -1)
denom = torch.sum(exponential, dim=1, keepdim=True)  # [2N, 1]

The self-similarity term sim(z_i, z_i) is removed by the mask.
The remaining terms exp(sim(z_i, z_k) / tau), k ≠ i, are summed along each row.
Result: the softmax denominator.
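
The masking step is easier to follow on a tiny matrix with known entries. In this sketch a 4 x 4 matrix holding the numbers 1 to 16 stands in for exp(sim / tau); `denom_alt` is an alternative I added for comparison (row sum minus the diagonal entry), not something the assignment asks for.

```python
import torch

N = 2
exponential = torch.arange(1.0, 17.0).view(2 * N, 2 * N)  # stand-in for exp(sim / tau)

mask = ~torch.eye(2 * N, dtype=torch.bool)                  # True everywhere except the diagonal
off_diag = exponential.masked_select(mask).view(2 * N, -1)  # [2N, 2N-1]
denom = off_diag.sum(dim=1, keepdim=True)                   # [2N, 1]

# Equivalent: sum each full row, then subtract its diagonal entry.
denom_alt = exponential.sum(dim=1, keepdim=True) - torch.diagonal(exponential).unsqueeze(1)

print(off_diag.shape)    # torch.Size([4, 3])
print(denom.squeeze(1))  # tensor([ 9., 20., 31., 42.])
```

masked_select returns a flat tensor in row-major order, which is why the view(2 * N, -1) call recovers one row per sample.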

Step 2: Compute the positive-pair similarities

pos_left = sim_positive_pairs(out_left, out_right)
pos_right = sim_positive_pairs(out_right, out_left)
pos_pairs = torch.cat([pos_left, pos_right], dim = 0)

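sim_positive_pairs returns the cosine similarity between matching rows of out_left and out_right,
that is, the similarity between the i-th sample and its positive partner i+N.
To compute the loss in one shot, the similarities for l(i, i+N) and l(i+N, i) are computed separately and joined with torch.cat.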
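
Option 1 from the hints (reading the positive pairs straight out of sim_matrix) gives the same values. A minimal sketch, assuming the rows are ordered as [out_left; out_right] so that the positives sit on the diagonals shifted by +N and -N:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N, D = 4, 8
out_left, out_right = torch.randn(N, D), torch.randn(N, D)
out = torch.cat([out_left, out_right], dim=0)

normalized = out / torch.linalg.norm(out, dim=1, keepdim=True)
sim_matrix = normalized @ normalized.T                            # [2N, 2N]

# Option 1: positive pairs are the diagonals offset by +N and -N.
pos_upper = torch.diagonal(sim_matrix, offset=N)                  # sim(z_k, z_{k+N})
pos_lower = torch.diagonal(sim_matrix, offset=-N)                 # sim(z_{k+N}, z_k)
pos_from_matrix = torch.cat([pos_upper, pos_lower]).unsqueeze(1)  # [2N, 1]

# Option 2: row-wise cosine similarity, as sim_positive_pairs does.
cos = F.cosine_similarity(out_left, out_right, dim=1).unsqueeze(1)  # [N, 1]
pos_from_pairs = torch.cat([cos, cos], dim=0)                       # [2N, 1]

print(torch.allclose(pos_from_matrix, pos_from_pairs, atol=1e-5))   # True
```

Since cosine similarity is symmetric, pos_left and pos_right hold the same numbers; the second call only serves to line the values up with rows N to 2N-1 of the denominator.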

Step 3: Compute the numerator

numerator = None
numerator = torch.exp(pos_pairs / tau)

Step 4: Compute the loss

loss = None
loss = -torch.log(numerator/denom)
loss = loss.mean()
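
One way to double-check Steps 1 to 4 is to note that the loss is a 2N-way softmax cross-entropy in which the "correct class" for row k is its positive partner. The sketch below is my own cross-check, not the assignment's reference solution: `simclr_loss_masked` condenses the steps above, and `simclr_loss_cross_entropy` is a hypothetical alternative built on F.cross_entropy. Both run on CPU with random inputs.

```python
import torch
import torch.nn.functional as F

def simclr_loss_masked(out_left, out_right, tau):
    """Condensed version of the masked-exponential computation (Steps 1 to 4)."""
    N = out_left.shape[0]
    z = F.normalize(torch.cat([out_left, out_right], dim=0), dim=1)
    sim_matrix = z @ z.T
    exponential = torch.exp(sim_matrix / tau)
    mask = ~torch.eye(2 * N, dtype=torch.bool)
    denom = exponential.masked_select(mask).view(2 * N, -1).sum(dim=1)
    pos = torch.cat([torch.diagonal(sim_matrix, offset=N),
                     torch.diagonal(sim_matrix, offset=-N)])
    numerator = torch.exp(pos / tau)
    return (-torch.log(numerator / denom)).mean()

def simclr_loss_cross_entropy(out_left, out_right, tau):
    """Hypothetical alternative: the same loss as a 2N-way classification problem."""
    N = out_left.shape[0]
    z = F.normalize(torch.cat([out_left, out_right], dim=0), dim=1)
    logits = z @ z.T / tau
    # Exclude the self-similarity term by sending its logit to -inf.
    logits = logits.masked_fill(torch.eye(2 * N, dtype=torch.bool), float('-inf'))
    # Row k's target is its positive partner: k+N for the left half, k-N for the right half.
    targets = torch.cat([torch.arange(N, 2 * N), torch.arange(0, N)])
    return F.cross_entropy(logits, targets)

torch.manual_seed(0)
out_left, out_right = torch.randn(5, 16), torch.randn(5, 16)
loss_masked = simclr_loss_masked(out_left, out_right, tau=0.5)
loss_ce = simclr_loss_cross_entropy(out_left, out_right, tau=0.5)
print(loss_masked.item(), loss_ce.item())  # the two values agree up to floating-point error
```

The cross-entropy form works in log space, so it avoids dividing two exponentials directly, which may matter for small values of tau.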

class Model(nn.Module):
    def __init__(self, feature_dim=128):
        super().__init__()

        self.f = []
        for name, module in resnet50().named_children():
            if name == 'conv1':
                module = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
            if not isinstance(module, nn.Linear) and not isinstance(module, nn.MaxPool2d):
                self.f.append(module)
        # encoder
        self.f = nn.Sequential(*self.f)
        # projection head
        self.g = nn.Sequential(nn.Linear(2048, 512, bias=False), nn.BatchNorm1d(512),
                               nn.ReLU(inplace=True), nn.Linear(512, feature_dim, bias=True))

    def forward(self, x):
        x = self.f(x)
        feature = torch.flatten(x, start_dim=1)
        out = self.g(feature)
        return F.normalize(feature, dim=-1), F.normalize(out, dim=-1)

def train(model, data_loader, train_optimizer, epoch, epochs, batch_size=32, temperature=0.5, device='cuda'):
    """Trains the model defined in ./model.py with one epoch.
    
    Inputs:
    - model: Model class object as defined in ./model.py.
    - data_loader: torch.utils.data.DataLoader object; loads in training data. You can assume the loaded data has been augmented.
    - train_optimizer: torch.optim.Optimizer object; applies an optimizer to training.
    - epoch: integer; current epoch number.
    - epochs: integer; total number of epochs.
    - batch_size: Number of training samples per batch.
    - temperature: float; temperature (tau) parameter used in simclr_loss_vectorized.
    - device: the device name to define torch tensors.

    Returns:
    - The average loss.
    """
    model.train()
    total_loss, total_num, train_bar = 0.0, 0, tqdm(data_loader)
    for data_pair in train_bar:
        x_i, x_j, target = data_pair
        x_i, x_j = x_i.to(device), x_j.to(device)
        
        out_left, out_right, loss = None, None, None
        ##############################################################################
        # TODO: Start of your code.                                                  #
        #                                                                            #
        # Take a look at the model.py file to understand the model's input and output.
        # Run x_i and x_j through the model to get out_left, out_right.              #
        # Then compute the loss using simclr_loss_vectorized.                        #
        ##############################################################################
        _, out_left = model(x_i)
        _, out_right = model(x_j)

        loss = simclr_loss_vectorized(out_left, out_right, temperature, device = device)
        ##############################################################################
        #                               END OF YOUR CODE                             #
        ##############################################################################
        
        train_optimizer.zero_grad()
        loss.backward()
        train_optimizer.step()

        total_num += batch_size
        total_loss += loss.item() * batch_size
        train_bar.set_description('Train Epoch: [{}/{}] Loss: {:.4f}'.format(epoch, epochs, total_loss / total_num))

    return total_loss / total_num


def train_val(model, data_loader, train_optimizer, epoch, epochs, device='cuda'):
    is_train = train_optimizer is not None
    model.train() if is_train else model.eval()
    loss_criterion = torch.nn.CrossEntropyLoss()

    total_loss, total_correct_1, total_correct_5, total_num, data_bar = 0.0, 0.0, 0.0, 0, tqdm(data_loader)
    with (torch.enable_grad() if is_train else torch.no_grad()):
        for data, target in data_bar:
            data, target = data.to(device), target.to(device)
            out = model(data)
            loss = loss_criterion(out, target)

            if is_train:
                train_optimizer.zero_grad()
                loss.backward()
                train_optimizer.step()

            total_num += data.size(0)
            total_loss += loss.item() * data.size(0)
            prediction = torch.argsort(out, dim=-1, descending=True)
            total_correct_1 += torch.sum((prediction[:, 0:1] == target.unsqueeze(dim=-1)).any(dim=-1).float()).item()
            total_correct_5 += torch.sum((prediction[:, 0:5] == target.unsqueeze(dim=-1)).any(dim=-1).float()).item()

            data_bar.set_description('{} Epoch: [{}/{}] Loss: {:.4f} ACC@1: {:.2f}% ACC@5: {:.2f}%'
                                     .format('Train' if is_train else 'Test', epoch, epochs, total_loss / total_num,
                                             total_correct_1 / total_num * 100, total_correct_5 / total_num * 100))

    return total_loss / total_num, total_correct_1 / total_num * 100, total_correct_5 / total_num * 100


def test(model, memory_data_loader, test_data_loader, epoch, epochs, c, temperature=0.5, k=200, device='cuda'):
    model.eval()
    total_top1, total_top5, total_num, feature_bank = 0.0, 0.0, 0, []
    with torch.no_grad():
        # generate feature bank
        for data, _, target in tqdm(memory_data_loader, desc='Feature extracting'):
            feature, out = model(data.to(device))
            feature_bank.append(feature)
        # [D, N]
        feature_bank = torch.cat(feature_bank, dim=0).t().contiguous()
        # [N]
        feature_labels = torch.tensor(memory_data_loader.dataset.targets, device=feature_bank.device)
        # loop test data to predict the label by weighted knn search
        test_bar = tqdm(test_data_loader)
        for data, _, target in test_bar:
            data, target = data.to(device), target.to(device)
            feature, out = model(data)

            total_num += data.size(0)
            # compute cos similarity between each feature vector and feature bank ---> [B, N]
            sim_matrix = torch.mm(feature, feature_bank)
            
            # [B, K]
            sim_weight, sim_indices = sim_matrix.topk(k=k, dim=-1)
            # [B, K]
            sim_labels = torch.gather(feature_labels.expand(data.size(0), -1), dim=-1, index=sim_indices)
            sim_weight = (sim_weight / temperature).exp()

            # counts for each class
            one_hot_label = torch.zeros(data.size(0) * k, c, device=device)
            # [B*K, C]
            one_hot_label = one_hot_label.scatter(dim=-1, index=sim_labels.view(-1, 1), value=1.0)
            # weighted score ---> [B, C]
            pred_scores = torch.sum(one_hot_label.view(data.size(0), -1, c) * sim_weight.unsqueeze(dim=-1), dim=1)

            pred_labels = pred_scores.argsort(dim=-1, descending=True)
            total_top1 += torch.sum((pred_labels[:, :1] == target.unsqueeze(dim=-1)).any(dim=-1).float()).item()
            total_top5 += torch.sum((pred_labels[:, :5] == target.unsqueeze(dim=-1)).any(dim=-1).float()).item()
            test_bar.set_description('Test Epoch: [{}/{}] Acc@1:{:.2f}% Acc@5:{:.2f}%'
                                     .format(epoch, epochs, total_top1 / total_num * 100, total_top5 / total_num * 100))

    return total_top1 / total_num * 100, total_top5 / total_num * 100

class Classifier(nn.Module):
    def __init__(self, num_class):
        super(Classifier, self).__init__()

        # Encoder.
        self.f = Model().f

        # Classifier.
        self.fc = nn.Linear(2048, num_class, bias=True)

    def forward(self, x):
        x = self.f(x)
        feature = torch.flatten(x, start_dim=1)
        out = self.fc(feature)
        return out

# Do not modify this cell.
feature_dim = 128
temperature = 0.5
k = 200
batch_size = 128
epochs = 10
percentage = 0.1

train_transform = compute_train_transform()
train_data = CIFAR10(root='data', train=True, transform=train_transform, download=True)
trainset = torch.utils.data.Subset(train_data, list(np.arange(int(len(train_data)*percentage))))
train_loader = DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=16, pin_memory=True)
test_transform = compute_test_transform()
test_data = CIFAR10(root='data', train=False, transform=test_transform, download=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=16, pin_memory=True)

model = Classifier(num_class=len(train_data.classes)).to(device)
for param in model.f.parameters():
    param.requires_grad = False

flops, params = profile(model, inputs=(torch.randn(1, 3, 32, 32).to(device),))
flops, params = clever_format([flops, params])
print('# Model Params: {} FLOPs: {}'.format(params, flops))
optimizer = optim.Adam(model.fc.parameters(), lr=1e-3, weight_decay=1e-6)
no_pretrain_results = {'train_loss': [], 'train_acc@1': [], 'train_acc@5': [],
           'test_loss': [], 'test_acc@1': [], 'test_acc@5': []}

best_acc = 0.0
for epoch in range(1, epochs + 1):
    train_loss, train_acc_1, train_acc_5 = train_val(model, train_loader, optimizer, epoch, epochs, device='cuda')
    no_pretrain_results['train_loss'].append(train_loss)
    no_pretrain_results['train_acc@1'].append(train_acc_1)
    no_pretrain_results['train_acc@5'].append(train_acc_5)
    test_loss, test_acc_1, test_acc_5 = train_val(model, test_loader, None, epoch, epochs)
    no_pretrain_results['test_loss'].append(test_loss)
    no_pretrain_results['test_acc@1'].append(test_acc_1)
    no_pretrain_results['test_acc@5'].append(test_acc_5)
    if test_acc_1 > best_acc:
        best_acc = test_acc_1

# Print the best test accuracy.
print('Best top-1 accuracy without self-supervised learning: ', best_acc)

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_normalization() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.container.Sequential'>.
[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
# Model Params: 23.52M FLOPs: 1.31G
Train Epoch: [1/10] Loss: 2.5539 ACC@1: 10.70% ACC@5: 51.30%: 100%|██████████| 40/40 [00:10<00:00,  3.96it/s]
Test Epoch: [1/10] Loss: 2.3212 ACC@1: 11.48% ACC@5: 51.60%: 100%|██████████| 79/79 [00:10<00:00,  7.56it/s]
Train Epoch: [2/10] Loss: 2.4299 ACC@1: 10.88% ACC@5: 51.42%: 100%|██████████| 40/40 [00:07<00:00,  5.17it/s]
Test Epoch: [2/10] Loss: 2.7025 ACC@1: 10.18% ACC@5: 55.10%: 100%|██████████| 79/79 [00:11<00:00,  6.77it/s]
Train Epoch: [3/10] Loss: 2.3950 ACC@1: 11.70% ACC@5: 53.12%: 100%|██████████| 40/40 [00:08<00:00,  4.99it/s]
Test Epoch: [3/10] Loss: 2.5049 ACC@1: 10.24% ACC@5: 53.42%: 100%|██████████| 79/79 [00:11<00:00,  6.93it/s]
Train Epoch: [4/10] Loss: 2.4029 ACC@1: 12.44% ACC@5: 54.02%: 100%|██████████| 40/40 [00:09<00:00,  4.17it/s]
Test Epoch: [4/10] Loss: 2.5870 ACC@1: 10.34% ACC@5: 52.39%: 100%|██████████| 79/79 [00:10<00:00,  7.56it/s]
Train Epoch: [5/10] Loss: 2.4127 ACC@1: 12.24% ACC@5: 54.48%: 100%|██████████| 40/40 [00:08<00:00,  4.99it/s]
Test Epoch: [5/10] Loss: 2.7166 ACC@1: 14.82% ACC@5: 54.43%: 100%|██████████| 79/79 [00:10<00:00,  7.39it/s]
Train Epoch: [6/10] Loss: 2.3939 ACC@1: 12.44% ACC@5: 54.02%: 100%|██████████| 40/40 [00:08<00:00,  4.89it/s]
Test Epoch: [6/10] Loss: 2.3872 ACC@1: 13.67% ACC@5: 54.45%: 100%|██████████| 79/79 [00:10<00:00,  7.29it/s]
Train Epoch: [7/10] Loss: 2.3648 ACC@1: 13.10% ACC@5: 54.66%: 100%|██████████| 40/40 [00:09<00:00,  4.40it/s]
Test Epoch: [7/10] Loss: 2.4616 ACC@1: 11.80% ACC@5: 55.54%: 100%|██████████| 79/79 [00:10<00:00,  7.43it/s]
Train Epoch: [8/10] Loss: 2.3864 ACC@1: 11.86% ACC@5: 55.12%: 100%|██████████| 40/40 [00:08<00:00,  4.71it/s]
Test Epoch: [8/10] Loss: 2.4651 ACC@1: 14.32% ACC@5: 59.70%: 100%|██████████| 79/79 [00:11<00:00,  7.00it/s]
Train Epoch: [9/10] Loss: 2.3793 ACC@1: 13.22% ACC@5: 56.60%: 100%|██████████| 40/40 [00:09<00:00,  4.19it/s]
Test Epoch: [9/10] Loss: 2.6685 ACC@1: 10.05% ACC@5: 57.28%: 100%|██████████| 79/79 [00:10<00:00,  7.45it/s]
Train Epoch: [10/10] Loss: 2.4030 ACC@1: 12.96% ACC@5: 57.64%: 100%|██████████| 40/40 [00:08<00:00,  4.97it/s]
Test Epoch: [10/10] Loss: 2.4337 ACC@1: 15.30% ACC@5: 58.28%: 100%|██████████| 79/79 [00:11<00:00,  7.15it/s]
Best top-1 accuracy without self-supervised learning:  15.299999999999999

# Do not modify this cell.
feature_dim = 128
temperature = 0.5
k = 200
batch_size = 128
epochs = 10
percentage = 0.1
pretrained_path = './pretrained_model/trained_simclr_model.pth'

train_transform = compute_train_transform()
train_data = CIFAR10(root='data', train=True, transform=train_transform, download=True)
trainset = torch.utils.data.Subset(train_data, list(np.arange(int(len(train_data)*percentage))))
train_loader = DataLoader(trainset, batch_size=batch_size, shuffle=True, num_workers=16, pin_memory=True)
test_transform = compute_test_transform()
test_data = CIFAR10(root='data', train=False, transform=test_transform, download=True)
test_loader = DataLoader(test_data, batch_size=batch_size, shuffle=False, num_workers=16, pin_memory=True)

model = Classifier(num_class=len(train_data.classes))
model.load_state_dict(torch.load(pretrained_path, map_location='cpu'), strict=False)
model = model.to(device)
for param in model.f.parameters():
    param.requires_grad = False

flops, params = profile(model, inputs=(torch.randn(1, 3, 32, 32).to(device),))
flops, params = clever_format([flops, params])
print('# Model Params: {} FLOPs: {}'.format(params, flops))
optimizer = optim.Adam(model.fc.parameters(), lr=1e-3, weight_decay=1e-6)
pretrain_results = {'train_loss': [], 'train_acc@1': [], 'train_acc@5': [],
           'test_loss': [], 'test_acc@1': [], 'test_acc@5': []}

best_acc = 0.0
for epoch in range(1, epochs + 1):
    train_loss, train_acc_1, train_acc_5 = train_val(model, train_loader, optimizer, epoch, epochs)
    pretrain_results['train_loss'].append(train_loss)
    pretrain_results['train_acc@1'].append(train_acc_1)
    pretrain_results['train_acc@5'].append(train_acc_5)
    test_loss, test_acc_1, test_acc_5 = train_val(model, test_loader, None, epoch, epochs)
    pretrain_results['test_loss'].append(test_loss)
    pretrain_results['test_acc@1'].append(test_acc_1)
    pretrain_results['test_acc@5'].append(test_acc_5)
    if test_acc_1 > best_acc:
        best_acc = test_acc_1

# Print the best test accuracy. You should see a best top-1 accuracy of >=70%.
print('Best top-1 accuracy with self-supervised learning: ', best_acc)

[INFO] Register count_convNd() for <class 'torch.nn.modules.conv.Conv2d'>.
[INFO] Register count_normalization() for <class 'torch.nn.modules.batchnorm.BatchNorm2d'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.activation.ReLU'>.
[INFO] Register zero_ops() for <class 'torch.nn.modules.container.Sequential'>.
[INFO] Register count_adap_avgpool() for <class 'torch.nn.modules.pooling.AdaptiveAvgPool2d'>.
[INFO] Register count_linear() for <class 'torch.nn.modules.linear.Linear'>.
# Model Params: 23.52M FLOPs: 1.31G
Train Epoch: [1/10] Loss: 1.8223 ACC@1: 64.98% ACC@5: 93.60%: 100%|██████████| 40/40 [00:09<00:00,  4.18it/s]
Test Epoch: [1/10] Loss: 1.3347 ACC@1: 78.04% ACC@5: 98.18%: 100%|██████████| 79/79 [00:10<00:00,  7.69it/s]
Train Epoch: [2/10] Loss: 1.1898 ACC@1: 76.08% ACC@5: 97.54%: 100%|██████████| 40/40 [00:08<00:00,  4.91it/s]
Test Epoch: [2/10] Loss: 0.9418 ACC@1: 79.12% ACC@5: 98.23%: 100%|██████████| 79/79 [00:11<00:00,  7.04it/s]
Train Epoch: [3/10] Loss: 0.9362 ACC@1: 76.12% ACC@5: 97.86%: 100%|██████████| 40/40 [00:08<00:00,  4.82it/s]
Test Epoch: [3/10] Loss: 0.7820 ACC@1: 79.72% ACC@5: 98.65%: 100%|██████████| 79/79 [00:10<00:00,  7.47it/s]
Train Epoch: [4/10] Loss: 0.8444 ACC@1: 76.82% ACC@5: 97.78%: 100%|██████████| 40/40 [00:08<00:00,  4.57it/s]
Test Epoch: [4/10] Loss: 0.7109 ACC@1: 79.69% ACC@5: 98.57%: 100%|██████████| 79/79 [00:10<00:00,  7.59it/s]
Train Epoch: [5/10] Loss: 0.7692 ACC@1: 77.70% ACC@5: 97.80%: 100%|██████████| 40/40 [00:08<00:00,  4.99it/s]
Test Epoch: [5/10] Loss: 0.6467 ACC@1: 81.11% ACC@5: 98.85%: 100%|██████████| 79/79 [00:11<00:00,  7.16it/s]
Train Epoch: [6/10] Loss: 0.7374 ACC@1: 77.62% ACC@5: 97.90%: 100%|██████████| 40/40 [00:09<00:00,  4.17it/s]
Test Epoch: [6/10] Loss: 0.6100 ACC@1: 81.47% ACC@5: 98.82%: 100%|██████████| 79/79 [00:10<00:00,  7.57it/s]
Train Epoch: [7/10] Loss: 0.6955 ACC@1: 78.10% ACC@5: 98.30%: 100%|██████████| 40/40 [00:07<00:00,  5.12it/s]
Test Epoch: [7/10] Loss: 0.5882 ACC@1: 81.69% ACC@5: 98.89%: 100%|██████████| 79/79 [00:10<00:00,  7.58it/s]
Train Epoch: [8/10] Loss: 0.6796 ACC@1: 78.28% ACC@5: 98.34%: 100%|██████████| 40/40 [00:07<00:00,  5.04it/s]
Test Epoch: [8/10] Loss: 0.5667 ACC@1: 82.23% ACC@5: 98.94%: 100%|██████████| 79/79 [00:10<00:00,  7.43it/s]
Train Epoch: [9/10] Loss: 0.6699 ACC@1: 78.72% ACC@5: 98.14%: 100%|██████████| 40/40 [00:09<00:00,  4.15it/s]
Test Epoch: [9/10] Loss: 0.5560 ACC@1: 82.19% ACC@5: 98.93%: 100%|██████████| 79/79 [00:10<00:00,  7.63it/s]
Train Epoch: [10/10] Loss: 0.6433 ACC@1: 78.94% ACC@5: 98.18%: 100%|██████████| 40/40 [00:08<00:00,  4.87it/s]
Test Epoch: [10/10] Loss: 0.5415 ACC@1: 82.28% ACC@5: 98.95%: 100%|██████████| 79/79 [00:11<00:00,  6.93it/s]
Best top-1 accuracy with self-supervised learning:  82.28
