
Base64 is a binary-to-text encoding scheme that represents binary data as an ASCII string. It’s often used to send data over channels designed for text, such as email and JSON-based APIs, so that binary data like images and files doesn’t get corrupted in transit. The name Base64 comes from its 64-character alphabet: A-Z, a-z, 0-9, +, and /. It is widely used in multimodal AI applications, embedded systems, cloud-based services, and web development. In this article, we’ll learn more about Base64 and how to use it.
Why Base64?
Base64 is mostly used in cases where binary data (e.g., images, videos, model weights, etc.) needs to be passed through text-based infrastructures without being altered or corrupted. But why is it a popular choice amongst so many other types of encodings? Let’s try to understand.
Base64 is:
- Text-safe: Can embed binary data in text-based formats like HTML, XML, JSON, etc.
- Easy to transport: The output uses only ASCII characters, so it passes through text-only channels without character-encoding issues or data corruption.
- Common for images: Often used in web development to embed images directly in HTML/CSS or JSON payloads.
Here’s how Base64 compares with other common encodings.
Encoding | Purpose | Use Case | Size Impact |
Base64 | Binary to text | Embedding images/files in HTML, JSON, etc. | ~33% increase |
Hex | Binary to Hexadecimal | Debugging, network traces | ~100% increase |
Gzip | Compression | Actual size reduction for text/binary | Compression ratio-dependent |
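The size figures in the table can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the sample payload is a made-up, highly repetitive string, so the gzip number here says nothing about how real images or files would compress.

```python
import base64
import gzip

# Hypothetical sample payload: 3,000 bytes of repetitive text.
data = b"Base64 demo " * 250

b64_size = len(base64.b64encode(data))   # 4 output characters per 3 input bytes
hex_size = len(data.hex())               # 2 output characters per input byte
gzip_size = len(gzip.compress(data))     # depends on how compressible the data is

print("raw   :", len(data))    # 3000
print("base64:", b64_size)     # 4000 (~33% larger)
print("hex   :", hex_size)     # 6000 (100% larger)
print("gzip  :", gzip_size)    # smaller than raw for this repetitive input
```

Note that Base64 and hex always grow the data by a fixed ratio, while gzip output depends entirely on the input.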
How Does Base64 Work?
Now let’s try to understand how Base64 works. Here’s a walkthrough of the step-by-step conversion of the string “Hello” into its Base64 format.
Step 1: Convert the Text to ASCII Bytes
Character | ASCII Decimal Value | Binary Value (8 bits) |
H | 72 | 01001000 |
e | 101 | 01100101 |
l | 108 | 01101100 |
l | 108 | 01101100 |
o | 111 | 01101111 |
So now, our string “Hello” would look like 01001000 01100101 01101100 01101100 01101111.
That’s 5 characters × 8 bits = 40 bits.
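Step 1 can be reproduced in Python as a quick check. This is a small sketch that assumes the input is plain ASCII, as in the “Hello” example.

```python
text = "Hello"

# Encode the string to bytes; each ASCII character becomes one byte.
raw = text.encode("ascii")

# Format every byte as an 8-bit binary string.
bits = [format(byte, "08b") for byte in raw]

print(list(raw))            # [72, 101, 108, 108, 111]
print(" ".join(bits))       # 01001000 01100101 01101100 01101100 01101111
print(len(raw) * 8, "bits") # 40 bits
```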
Step 2: Break the Binary into 6-bit Groups
Base64 operates on 6-bit blocks, so we regroup the 40 bits, currently in chunks of 8, into chunks of 6:
01001000 01100101 01101100 01101100 01101111
Regrouped into chunks of 6, the same bits look like this:
010010 000110 010101 101100 011011 000110 1111
Since 40 isn’t divisible by 6, we end up with 6 full 6-bit blocks and 1 leftover 4-bit block. We pad the last block with 2 zero bits to make it a full 6-bit chunk:
010010 000110 010101 101100 011011 000110 111100
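The regrouping and zero-padding in Step 2 can be sketched as follows. The variable names are illustrative only; the logic simply joins the 8-bit strings, pads on the right with zeros, and slices every 6 bits.

```python
# Join the 8-bit representation of every byte into one long bit string.
bitstring = "".join(format(byte, "08b") for byte in "Hello".encode("ascii"))

# Pad with zero bits on the right until the length is a multiple of 6.
remainder = len(bitstring) % 6
if remainder:
    bitstring += "0" * (6 - remainder)

# Slice the padded bit string into 6-bit groups.
groups = [bitstring[i:i + 6] for i in range(0, len(bitstring), 6)]

print(" ".join(groups))
# 010010 000110 010101 101100 011011 000110 111100
```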
Step 3: Convert 6-bit Groups to Decimal
Since 2^6 = 64, every 6-bit group maps to a decimal value between 0 and 63.
6-bit binary | Decimal |
010010 | 18 |
000110 | 6 |
010101 | 21 |
101100 | 44 |
011011 | 27 |
000110 | 6 |
111100 | 60 |
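The conversion in this table is a plain base-2 parse, which Python’s built-in `int` handles directly. A short sketch using the groups from Step 2:

```python
groups = ["010010", "000110", "010101", "101100", "011011", "000110", "111100"]

# int(text, 2) interprets the string as a binary number.
values = [int(group, 2) for group in groups]

print(values)  # [18, 6, 21, 44, 27, 6, 60]
```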
Step 4: Map to Base64 Characters
Following the standard Base64 character table, we will map our decimal values to the corresponding characters.

Decimal | Base64 Character |
18 | S |
6 | G |
21 | V |
44 | s |
27 | b |
6 | G |
60 | 8 |
We get “SGVsbG8” as the unpadded Base64 encoding of our string “Hello”.
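The lookup in Step 4 can be sketched by building the standard alphabet (A-Z, a-z, 0-9, +, /) as a string and indexing into it. The name `ALPHABET` is just an illustrative choice.

```python
import string

# Standard Base64 alphabet: index 0-25 -> A-Z, 26-51 -> a-z, 52-61 -> 0-9, 62 -> +, 63 -> /
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

values = [18, 6, 21, 44, 27, 6, 60]

# Each decimal value is an index into the alphabet.
encoded = "".join(ALPHABET[value] for value in values)

print(encoded)  # SGVsbG8
```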
Step 5: Add Padding
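Base64 processes input in groups of 3 bytes (24 bits), and each group produces 4 characters. Since our original string has 5 bytes (not a multiple of 3), the output is padded with “=” to make its length a multiple of 4 characters.
5 bytes = 40 bits -> 7 Base64 characters (the last one built from 4 data bits plus 2 zero bits) + 1 “=” padding character -> Total 8 characters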
Final Base64 encoded string: “Hello” -> SGVsbG8=
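Putting the five steps together, a from-scratch encoder might look like the sketch below. The function name `b64_encode_manual` is hypothetical, and the bit-string approach is chosen for readability rather than speed; in practice you would use the standard library’s `base64` module, which is used here only to check the result.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def b64_encode_manual(data: bytes) -> str:
    """Illustrative Base64 encoder that follows the walkthrough step by step."""
    # Step 1: bytes -> one long string of bits.
    bits = "".join(format(byte, "08b") for byte in data)
    # Step 2: pad with zero bits to a multiple of 6, then split into 6-bit groups.
    bits += "0" * (-len(bits) % 6)
    groups = [bits[i:i + 6] for i in range(0, len(bits), 6)]
    # Steps 3 and 4: group -> decimal value -> alphabet character.
    chars = "".join(ALPHABET[int(group, 2)] for group in groups)
    # Step 5: pad with "=" so the output length is a multiple of 4.
    return chars + "=" * (-len(chars) % 4)

print(b64_encode_manual(b"Hello"))  # SGVsbG8=

# Cross-check against the standard library on inputs of different lengths.
for sample in (b"", b"H", b"He", b"Hel", b"Hello", b"Hello World"):
    assert b64_encode_manual(sample) == base64.b64encode(sample).decode("ascii")
```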
Python Implementation of Base64
Now that you understand how Base64 works, let me show you how to implement it in Python. We’ll first try to encode and decode some text, and then do the same with an image.
Encoding and Decoding Text
Let’s encode this simple text using Base64 and then decode the encoded string back to its original form.
import base64
# Text encoding
message = "Hello World"
encoded = base64.b64encode(message.encode())
print("Encoded:", encoded)
# Decoding it back
decoded = base64.b64decode(encoded).decode()
print("Decoded:", decoded)
Output
Encoded: b'SGVsbG8gV29ybGQ='
Decoded: Hello World
Encoding and Decoding Images
In vision-related applications, especially with Vision Language Models (VLMs), images are often encoded in Base64 when:
- Transmitting images via JSON payloads to or from APIs.
- Embedding images for training and serving multimodal models.
- Calling APIs that serve CLIP, BLIP, LLaVA, or other vision-language models and accept images as serialized Base64 strings.
Here’s a simple Python snippet that encodes an image.
from PIL import Image
import base64
import io
# Load and encode image
img = Image.open("example.jpeg")
buffered = io.BytesIO()
img.save(buffered, format="JPEG")
img_bytes = buffered.getvalue()
img_base64 = base64.b64encode(img_bytes).decode('utf-8')
print("Base64 String:", img_base64[:100], "...") # Truncated
We can also decode the Base64-encoded data back into an image using the code below.
from PIL import Image
import base64
import io
from IPython.display import display, Image as IPythonImage
# Assume `img_base64` is the base64 string
img_data = base64.b64decode(img_base64)
img = Image.open(io.BytesIO(img_data))
display(IPythonImage(data=img_data))
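Since images are often sent inside JSON payloads, the full round trip can be sketched as below. This is only an illustration: the field names `filename` and `image_b64` are hypothetical, and a short run of placeholder bytes stands in for the contents of a real image file.

```python
import base64
import json

# Placeholder bytes standing in for a real image file's contents.
image_bytes = bytes(range(256))

# Sender: Base64-encode the bytes and decode to str, since JSON cannot hold raw bytes.
payload = json.dumps({
    "filename": "example.jpeg",  # hypothetical field name
    "image_b64": base64.b64encode(image_bytes).decode("ascii"),  # hypothetical field name
})

# Receiver: parse the JSON and decode the Base64 string back to the original bytes.
received = json.loads(payload)
restored = base64.b64decode(received["image_b64"])

print(restored == image_bytes)  # True
```

The `.decode("ascii")` step matters: `base64.b64encode` returns `bytes`, and `json.dumps` raises a `TypeError` if handed a `bytes` value.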
To learn more about Base64 and find many more encoders and decoders, you can refer to this site.
Things to Keep in Mind While Using Base64
Although Base64 is useful across many domains, here are a few things to keep in mind while working with it.
- Size Overhead (~33%): For every 3 bytes of binary input, Base64 outputs 4 bytes of text. On large batches (e.g., thousands of high‑res frames), this can quickly eat into network bandwidth and storage. Consider compressing images (JPEG/PNG) before encoding to Base64, and stream the data where possible.
- Memory & CPU Load: Converting and buffering an entire image at once can spike overall memory usage during encoding. Similarly, decoding into raw bytes and then parsing via an image library also adds CPU overhead.
- Not a Compression Algorithm: Base64 doesn’t reduce size, it inflates it. Always apply true compression (e.g., JPEG, WebP) on the binary data before encoding to Base64.
- Security Considerations: If you blindly concatenate Base64 strings into HTML or JSON without sanitizing them, you could open up XSS or JSON‑injection vectors. Also, extremely large Base64 payloads can exhaust parsers, so enforce maximum payload sizes at the gateway.
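One way to limit the memory spike mentioned above is to encode in chunks rather than buffering the whole file. The sketch below is one possible approach, not a definitive implementation: the helper name `b64_encode_stream` and the default chunk size are assumptions. It relies on the fact that a chunk whose length is a multiple of 3 bytes encodes without padding, so the encoded chunks can simply be concatenated.

```python
import base64
import io

def b64_encode_stream(src, dst, chunk_size=3 * 1024):
    """Base64-encode the binary stream `src` into `dst` one chunk at a time.

    Assumes `src.read(n)` returns exactly n bytes until the end of the stream,
    which holds for io.BytesIO and buffered regular files.
    """
    if chunk_size % 3:
        raise ValueError("chunk_size must be a multiple of 3")
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # Only the final chunk can be shorter, so "=" padding appears only at the end.
        dst.write(base64.b64encode(chunk))

# Usage with in-memory streams standing in for real files.
data = bytes(range(256)) * 50
src, dst = io.BytesIO(data), io.BytesIO()
b64_encode_stream(src, dst, chunk_size=300)

print(dst.getvalue() == base64.b64encode(data))  # True
```

With real files, `src` and `dst` would be file objects opened in `"rb"` and `"wb"` mode, so only one chunk is held in memory at a time.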
Conclusion
In an era where models can “see” as well as “read”, Base64 has quietly become a cornerstone of multimodal systems. It plays a very important role in data encoding by bridging the gap between binary data and text‑only systems. In vision‑language workflows, it standardizes how images travel from mobile clients to cloud GPUs, while preserving reproducibility and easing integration.
Making images compatible with text-based infrastructure has always been a complex problem to solve. Base64 encoding provides a practical solution to this, enabling image transmission over APIs and packaging datasets for training.