CSCI 3080 - Discrete Structures

Lab 4 - Compressing Text Files using Huffman Coding

1. Objective

In this lab, students will implement Huffman coding in C++ to compress and decompress text files. They will learn how to construct a Huffman tree, generate Huffman codes, encode a text file into a compressed binary format, and then decode it back to its original form. Additionally, students will analyze the efficiency of Huffman compression by comparing the file sizes before and after compression. By the end of this lab, students will be able to:

2. Lab Instructions

Step 1: Read the Input File

Step 2: Compute Character Frequencies

Step 3: Defining and Creating Huffman Tree Nodes

Reference for Step 3: See InOrder.cpp for a related example of creating new nodes and deleting nodes.

Step 4: Build the Huffman Tree

Step 5: Generate Huffman Codes

Step 6: Encode the Text File

Step 7: Decode the Huffman encoded text

Step 8: Analyze Compression Efficiency

3. Requirements

4. Sample Input/Output

        Enter text file name: input.txt
        
        Huffman Codes:
        p : 1111101
        f : 111111
        I : 11111001
        v : 11111000
        g : 111101
        m : 111100
        i : 11101
        c : 11100
          : 110
        l : 10111
        q : 1011011
        T : 10110100
        u : 101100
        s : 000
        d : 0010
        n : 0011
        a : 1000
        H : 10110101
        e : 010
        r : 0111
        t : 0110
        b : 10010110
        o : 1010
        z : 1001010
        y : 10010111
        . : 100100
        h : 10011
        
        Encoded Text (first 100 bits): 1011010110110011111111111111110010000011110111001010001011101001111110111011101000110100011010111101...
        
        Original file size: 212 bytes
        
        Compressed file size: 110 bytes (approx)
        Compressed output saved to 'compressed.bin'
        
        Decompressed file size: 212 bytes
        Decompressed output saved to 'decompressed.txt'
        
        Deleting nodes:  s d n e t r a . z b y h o u T H q l   c i m g v I p f 
    

5. Grading Criteria

Grading Criteria (Total Points: 100)
Category Criteria Points
Huffman Tree Node
(5 points)
The Huffman Tree Node is defined correctly. 5
New Node Function
(5 points)
The function to create a new node is defined correctly. 5
Huffman tree
(17 points)
The Huffman tree is constructed correctly. 17
Huffman encoding
(15 points)
Huffman encoding has been successfully implemented 15
Huffman decoding
(15 points)
Huffman decoding has been successfully implemented 15
File I/O operations
(15 points)
File I/O operations work correctly 15
Main Function Implementation
(10 points)
All functions are correctly invoked in the main function 10
Analyze compression efficiency
(5 points)
Accurately compare the sizes of input.txt, compressed.bin, and decompressed.txt 5
Code Readability & Comments
(5 points)
Maintain consistent indentation and spacing throughout the code. 2
Include comments that explain the purpose of key sections and logic. 3
Test Cases & Output Accuracy
(5 points)
The output for different test cases is correct and consistent. 5
AI disclaimer
(3 points)
Please put an AI disclaimer at the top of your source code as comments. 3
Errors
Your program has syntax errors, compiling errors, running errors, or infinite loops. -50 points

6. Comments

7. Submit

Please submit your Lab 4 source code file: lab4.cpp in D2L!

8. Jupyter Hub for Programming

Jupyter Hub for Programming
By logging in to Jupyter Hub with your MTSU credentials, you can create, compile, and run your source code there.

9. Compile and Run the Source Code


Congratulations! You have finished your Lab4!