Huffman algorithm

Implement Huffman's algorithm to compress and decompress files without data loss.

I found this project highly engaging. Before undertaking it, I had only a limited understanding of compression techniques and how they are applied to files. The project's focus on efficiency taught me the significance of details such as avoiding unnecessary if statements and using buffered input streams like BufferedInputStream instead of a bare FileInputStream. It also served as a valuable introduction to bit manipulation, which was a completely new area for me.
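As a small sketch of why buffering matters (the class, method, and variable names here are my own, not from the project): a BufferedInputStream serves most `read()` calls from an in-memory buffer instead of issuing a system call per byte, which is what makes per-byte reading of a file tolerable.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class BufferedReadDemo {
    // Count bytes by reading one at a time. With a plain FileInputStream this
    // pattern is slow (one system call per byte); wrapped in a
    // BufferedInputStream, most reads hit the internal buffer instead.
    static int countBytes(InputStream raw) throws IOException {
        int count = 0;
        try (InputStream in = new BufferedInputStream(raw)) {
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // In the real project the source would be a FileInputStream over the
        // input file; a byte array stands in for it here.
        byte[] data = {1, 2, 3, 4, 5};
        System.out.println(countBytes(new ByteArrayInputStream(data)));
    }
}
```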

Frequency table

To store the codes efficiently, we used a HashMap. While HashMaps can be slower on large data sets, our map's size is bounded by a constant, at most 256 entries (one per possible byte value), so the impact on efficiency is negligible. Alternatively, we could have stored the nodes in an array, assigned each a parent, and traversed the tree upward to obtain the codes. However, this approach added complexity without substantial efficiency gains.
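A minimal sketch of this scheme, assuming the standard Huffman construction (the node fields and method names are mine, not the project's): merge the two least-frequent nodes until one tree remains, then walk it once, collecting each symbol's code into a HashMap keyed by byte value.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

public class HuffmanCodes {
    // Minimal tree node for the sketch; symbol is -1 for internal nodes.
    static class Node {
        final int symbol;
        final long freq;
        final Node left, right;
        Node(int symbol, long freq, Node left, Node right) {
            this.symbol = symbol; this.freq = freq; this.left = left; this.right = right;
        }
    }

    // Build the Huffman tree from byte frequencies and collect the codes
    // into a HashMap with at most 256 entries.
    static Map<Integer, String> codes(long[] freq) {
        PriorityQueue<Node> pq = new PriorityQueue<>(Comparator.comparingLong((Node n) -> n.freq));
        for (int b = 0; b < freq.length; b++)
            if (freq[b] > 0) pq.add(new Node(b, freq[b], null, null));
        if (pq.size() == 1) pq.add(new Node(-1, 0, null, null)); // single-symbol input
        while (pq.size() > 1) {
            Node a = pq.poll(), b = pq.poll();
            pq.add(new Node(-1, a.freq + b.freq, a, b));
        }
        Map<Integer, String> map = new HashMap<>();
        walk(pq.poll(), "", map);
        return map;
    }

    // Depth-first walk: left edges append '0', right edges append '1'.
    static void walk(Node n, String prefix, Map<Integer, String> map) {
        if (n == null) return;
        if (n.left == null && n.right == null) {
            if (n.symbol >= 0) map.put(n.symbol, prefix);
            return;
        }
        walk(n.left, prefix + "0", map);
        walk(n.right, prefix + "1", map);
    }

    public static void main(String[] args) {
        long[] freq = new long[256];
        for (byte b : "abracadabra".getBytes()) freq[b & 0xFF]++;
        Map<Integer, String> codes = codes(freq);
        // 'a' is the most frequent symbol, so its code is the shortest.
        System.out.println(codes.get((int) 'a'));
    }
}
```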


For compact storage in the compressed file, we chose to write only each node's value and its height in the tree. From this information, the “readHeader” function can recreate the tree by iterating through the nodes written in the file. Storing the nodes this way guarantees that each one occupies no more than two bytes; storing the codes directly could have taken more than two bytes per symbol, making them harder to manage. As a result, we saved approximately 512 bytes compared to the demonstration.
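A rough sketch of that header layout under my own assumptions (the exact byte format, the leading leaf-count byte, and the writeHeader name are illustrative; only readHeader comes from the project): each leaf contributes one byte for its value and one byte for its depth, so the whole header stays within 2 bytes per symbol.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

public class HuffmanHeader {
    // Write one (symbol, depth) pair per leaf: two bytes each, preceded by
    // a leaf count. The count is stored as (n - 1) so 256 leaves fit in a byte.
    static byte[] writeHeader(Map<Integer, Integer> depths) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(depths.size() - 1);
        for (Map.Entry<Integer, Integer> e : new TreeMap<>(depths).entrySet()) {
            out.write(e.getKey());   // the byte value of the leaf
            out.write(e.getValue()); // its depth (height) in the tree
        }
        return out.toByteArray();
    }

    // Mirror of writeHeader: recover the (symbol, depth) pairs, from which
    // the tree can be rebuilt.
    static Map<Integer, Integer> readHeader(byte[] header) {
        Map<Integer, Integer> depths = new TreeMap<>();
        int n = (header[0] & 0xFF) + 1;
        for (int i = 0; i < n; i++) {
            depths.put(header[1 + 2 * i] & 0xFF, header[2 + 2 * i] & 0xFF);
        }
        return depths;
    }

    public static void main(String[] args) throws IOException {
        Map<Integer, Integer> depths = new TreeMap<>();
        depths.put((int) 'a', 1);
        depths.put((int) 'b', 2);
        depths.put((int) 'c', 2);
        byte[] header = writeHeader(depths);
        System.out.println(header.length);      // 1 count byte + 2 bytes per leaf
        System.out.println(readHeader(header)); // round-trips the pairs
    }
}
```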


Grade: 100%

Code: 6/6

Report: 5/5

Note: some text has been generated by ChatGPT