Multitudes, an IPython notebook series on what's inside our files

"Do I contradict myself? Very well then I contradict myself, (I am large, I contain multitudes.)" Walt Whitman, Song of Myself

I told a colleague I was curious about "perceptual image hashing," and we got to talking about image formats and how a perceptual hash isn't much different from a very lossy compression, unlike the cryptographic hashes I was more familiar with. This brought us to JPEG and how clearly the format relies on the discrete cosine transform to encode images. (JPEG 2000, despite the name, uses a wavelet transform instead.)
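To make the contrast between the two kinds of hash concrete, here is a minimal sketch. It assumes a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean) and a made-up 4x4 grayscale image; real perceptual hashes usually downsample and transform the image first, so treat the function names and data here as illustrative only.

```python
import hashlib

def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, 1 if brighter than the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(a, b):
    """Number of bit positions where two hashes differ."""
    return bin(a ^ b).count("1")

def sha256_of(pixels):
    """Cryptographic hash of the raw pixel bytes."""
    return hashlib.sha256(bytes(p for row in pixels for p in row)).hexdigest()

# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# The same image, slightly brightened.
brighter = [[min(255, p + 5) for p in row] for row in image]

perceptual_distance = hamming(average_hash(image), average_hash(brighter))
crypto_same = sha256_of(image) == sha256_of(brighter)

print(perceptual_distance)  # small distance: the images "look" alike
print(crypto_same)          # the cryptographic digests do not match
```

The perceptual hash throws away almost everything except the coarse light/dark pattern, which is why a small brightness change leaves it untouched while the SHA-256 digest changes completely.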

It led me to realize that I have only a passing familiarity with what goes inside a file. I get that there are structs in programs, which become bytes, which go into a file, but the layout itself seems rather arbitrary. What decisions go into packing data in a way that other programs and machines understand? Ideally, the layout is concise and not too expensive to process or implement. Naturally, different formats optimize for different concerns: an uncompressed bitmap image is easier to decode than a JPEG, but it takes up more disk space and bandwidth.
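As a rough illustration of "structs that become bytes," the sketch below uses Python's standard `struct` module to pack a tiny header and read it back. The header layout and the `MULT` magic value are invented for this example and do not correspond to any real format.

```python
import struct

# Hypothetical header layout (an assumption for illustration):
#   4s  magic bytes identifying the format
#   H   width,  unsigned 16-bit
#   H   height, unsigned 16-bit
#   B   bits per pixel, unsigned 8-bit
# "<" means little-endian with no padding between fields.
HEADER = struct.Struct("<4sHHB")

def pack_header(width, height, bits_per_pixel):
    """Turn the fields into the exact bytes that would be written to disk."""
    return HEADER.pack(b"MULT", width, height, bits_per_pixel)

def unpack_header(data):
    """Read the fields back, rejecting data that lacks the magic bytes."""
    magic, width, height, bits_per_pixel = HEADER.unpack(data[:HEADER.size])
    if magic != b"MULT":
        raise ValueError("not a MULT header")
    return width, height, bits_per_pixel

raw = pack_header(640, 480, 24)
print(HEADER.size)         # number of bytes the header occupies
print(raw.hex(" "))        # the bytes as they would appear in a hex dump
print(unpack_header(raw))  # the original fields, recovered
```

Byte order, field widths, and padding are exactly the kind of decisions a format has to pin down so that a reader on another machine recovers the same values.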

Ideas for formats to look at:

Image formats:
- BMP
- GIF
- JPEG
- PNG
Lossless Compression schemes:
- GZIP
- Others? I need to research
Audio, Lossy and Lossless:
- WAV
- FLAC
- OGG
- MP3
Video and media containers:
- MPEG
- AVI
- MKV
- Subtitle encoding
Documents:
- PDF
- DOCX
- PPT
- XLS
- Open/Libre Office
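Many of the formats above announce themselves with a few fixed "magic" bytes at the start of the file. The sketch below is a hedged starting point for telling them apart: the `sniff` function and its table are illustrative rather than exhaustive, and formats without a single fixed signature (MP3, for example) are deliberately left out.

```python
# Well-known leading signatures for some of the formats listed above.
MAGIC = [
    (b"\x89PNG\r\n\x1a\n", "PNG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"BM", "BMP"),
    (b"\x1f\x8b", "GZIP"),
    (b"fLaC", "FLAC"),
    (b"OggS", "OGG"),
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP container (e.g. DOCX)"),
    (b"\x1a\x45\xdf\xa3", "EBML container (e.g. MKV)"),
]

def sniff(data):
    """Guess a format from the first bytes of a file, or return None."""
    # RIFF files share a prefix; the form type sits at bytes 8..11.
    if data[:4] == b"RIFF" and len(data) >= 12:
        form = data[8:12]
        if form == b"WAVE":
            return "WAV"
        if form == b"AVI ":
            return "AVI"
    for signature, name in MAGIC:
        if data.startswith(signature):
            return name
    return None

print(sniff(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
print(sniff(b"RIFF\x24\x00\x00\x00WAVEfmt "))
print(sniff(b"hello"))
```

To try it on a real file, read the first dozen or so bytes with `open(path, "rb").read(16)` and pass them to `sniff`.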

mccartykim/multitudes
