dd Study Notes

dd

dd is a low-level utility that copies data from an input source to an output destination, optionally transforming it along the way. It works on raw bytes and blocks rather than file by file, as cp does.

The name dd is usually traced to the DD (Data Definition) statement in IBM’s JCL (Job Control Language); it is also informally expanded as “data duplicator” or “data description”. The command converts and copies files and is commonly used for copying raw data, creating disk images, and making backups. It is jokingly called “disk destroyer” because a mistyped of= can overwrite an entire disk.

In short: A tool that copies raw data block-by-block, not file-by-file.


Core idea

dd follows this pattern:

input → (optional transform) → output

You control:

  • where input comes from (if=)
  • where output goes (of=)
  • how big chunks are (bs=)

Safe example

This does NOT touch disks. It just copies a file.

Create a test file:

echo "hello dd" > test.txt

Copy it using dd:

dd if=test.txt of=copy.txt

Check result:

cat copy.txt

You should see:

hello dd

Add block size

dd if=test.txt of=copy2.txt bs=1

Meaning:

  • read 1 byte at a time
  • write 1 byte at a time

count=

dd if=test.txt of=partial.txt bs=1 count=4

This copies only 4 bytes: 4 blocks of 1 byte each, so partial.txt contains “hell”.
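To check what bs=1 count=4 actually produced, a minimal sketch (assuming a POSIX shell and a scratch directory; test.txt and partial.txt are the file names used in these notes) is to print the result and count its bytes:

```shell
# Recreate the test file from the safe example.
printf 'hello dd\n' > test.txt

# bs=1 count=4: four 1-byte blocks, so exactly 4 bytes are copied.
# dd prints its statistics on standard error; 2>/dev/null hides them here.
dd if=test.txt of=partial.txt bs=1 count=4 2>/dev/null

cat partial.txt; echo     # prints: hell
wc -c < partial.txt       # prints: 4 (some systems add leading spaces)
```

Changing count=4 to count=5 would copy “hello”, since each block is still one byte.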


Option    Meaning
if=       input file
of=       output file
bs=       block size
count=    how many blocks to copy

Blocks

dd does not think in files but in fixed-size chunks (blocks).

Key parameters:

  • ibs= input block size
  • obs= output block size
  • bs= sets both input and output block size

The default block size is 512 bytes.

Example:

dd if=input bs=1M of=output

Reads and writes in 1 MiB (1,048,576-byte) chunks. In GNU dd, the suffix M means 1024×1024 bytes, while MB means 1000×1000 bytes.
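The block size changes how the copy is chunked, not what ends up in the output. A small sketch, assuming /dev/zero is available and using hypothetical scratch file names, copies the same 4096 bytes with two different block sizes and compares the results:

```shell
# Same 4096 bytes of zeros, copied with two different block sizes.
dd if=/dev/zero of=bs_small.bin bs=512  count=8 2>/dev/null   # 8 blocks of 512 bytes
dd if=/dev/zero of=bs_large.bin bs=4096 count=1 2>/dev/null   # 1 block of 4096 bytes

# Both files are 4096 bytes long and byte-for-byte identical.
wc -c < bs_small.bin
wc -c < bs_large.bin
cmp bs_small.bin bs_large.bin && echo "identical"
```

Larger blocks mean fewer read and write calls, which is why big copies are usually run with a block size well above the 512-byte default.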


Input/output selection

  • if=file input file
  • of=file output file

Example:

dd if=/dev/sda of=backup.img

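Copies the raw contents of the disk /dev/sda into the file backup.img. Reading a disk device usually requires root privileges.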
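The imaging workflow can be rehearsed without touching a real disk. In the sketch below, fake_disk.bin is a hypothetical regular file standing in for a device such as /dev/sda; the image is then verified with cmp:

```shell
# fake_disk.bin is a stand-in for a real device (an assumption for this demo).
dd if=/dev/urandom of=fake_disk.bin bs=4096 count=16 2>/dev/null

# Image it the same way one would image a device.
dd if=fake_disk.bin of=backup.img bs=4096 2>/dev/null

# cmp exits with status 0 only if the image matches the source byte for byte.
cmp fake_disk.bin backup.img && echo "image verified"
```

With a real device, the same cmp check works as long as the disk is not being written to while it is imaged.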


Skipping and limiting data

These control position and size of copy:

  • skip=n skip input blocks before reading
  • seek=n skip output blocks before writing
  • count=n copy only n input blocks

Example:

dd if=file of=out bs=1M skip=10 count=5

Copies 5 MiB, starting 10 MiB into the input file.
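The same options are easier to follow with one-byte blocks, where every block is a single visible character. A minimal sketch with hypothetical scratch files:

```shell
printf 'ABCDEFGHIJ' > letters.txt

# skip=3 passes over 3 input blocks (A, B, C); count=4 copies the next 4.
dd if=letters.txt of=middle.txt bs=1 skip=3 count=4 2>/dev/null
cat middle.txt; echo      # prints: DEFG

# seek=2 leaves the first 2 output blocks alone and starts writing at offset 2.
# conv=notrunc keeps the rest of the existing file instead of truncating it.
printf '0123456789' > digits.txt
printf 'xy' | dd of=digits.txt bs=1 seek=2 conv=notrunc 2>/dev/null
cat digits.txt; echo      # prints: 01xy456789
```

Without conv=notrunc, the second command would cut digits.txt off after the written bytes, leaving only “01xy”.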


conv=

A. Byte-level transformations

  • swab swap every pair of bytes
  • lcase uppercase → lowercase
  • ucase lowercase → uppercase

B. Record/line transformations

  • block fixed-length records (pad with spaces)
  • unblock remove padding spaces, add newline

C. Padding / structure handling

  • sync pad every input block to the full ibs size with NUL bytes (with spaces when combined with block or unblock)
  • sparse create holes instead of writing zero blocks

D. Character encoding conversions

  • ascii
  • ebcdic
  • ibm

These convert between legacy character encodings.

E. Copy and output behaviors (not data conversion)

  • notrunc do not truncate the output file
  • noerror continue after read errors
  • fdatasync physically write output data to disk before dd finishes
  • fsync like fdatasync, but also write file metadata
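A few of these conversions can be tried on short strings. The sketch below assumes a POSIX-style dd and uses hypothetical output file names; the inputs are chosen so that the results are easy to predict:

```shell
# ucase: lowercase -> uppercase.
printf 'hello dd\n' | dd conv=ucase of=upper.txt 2>/dev/null
cat upper.txt                 # prints: HELLO DD

# swab: swap each pair of adjacent bytes (ab -> ba, cd -> dc, ef -> fe).
printf 'abcdef' | dd conv=swab of=swapped.txt 2>/dev/null
cat swapped.txt; echo         # prints: badcfe

# block with cbs=4: each input line becomes a 4-byte record,
# with the newline dropped and the rest padded with spaces.
printf 'ab\ncd\n' | dd cbs=4 conv=block of=records.txt 2>/dev/null
wc -c < records.txt           # prints: 8  ("ab  cd  ")
```

Several conversions can be combined in one run with a comma-separated list, for example conv=ucase,notrunc.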

File access flags iflag=, oflag=

These affect how files are accessed, not data content.

Common input flags (iflag=)

  • fullblock → ensure full blocks are read
  • nocache → ask the kernel to drop cached data for the file (advisory; it does not bypass the cache)
  • nonblock → non-blocking I/O

Output flags (oflag=)

  • append → write in append mode (combine with conv=notrunc)
  • direct → use direct I/O, bypassing the buffer cache
  • sync → synchronized writes for data and metadata
  • dsync → synchronized writes for data only
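Two of these flags have effects that are easy to observe. The sketch below assumes GNU dd (iflag= and oflag= are not available in every dd implementation) and uses hypothetical file names; the sleep only simulates a slow pipe:

```shell
# oflag=append writes at the end of the file. conv=notrunc is needed as well,
# otherwise dd truncates log.txt before appending.
printf 'first\n' > log.txt
printf 'second\n' | dd of=log.txt oflag=append conv=notrunc 2>/dev/null
cat log.txt                   # prints "first" and "second" on separate lines

# A slow pipe delivers "ab" first and "cd" one second later.
# Without fullblock, count=1 is satisfied by the first (short) read.
( printf 'ab'; sleep 1; printf 'cd' ) | dd of=short.txt bs=4 count=1 2>/dev/null
cat short.txt; echo           # prints: ab

# With iflag=fullblock, dd keeps reading until the 4-byte block is full.
( printf 'ab'; sleep 1; printf 'cd' ) | dd of=full.txt bs=4 count=1 iflag=fullblock 2>/dev/null
cat full.txt; echo            # prints: abcd
```

This is why iflag=fullblock is commonly paired with count= when the input is a pipe rather than a regular file.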

Sparse file handling

  • conv=sparse

When an output block consists entirely of zero (NUL) bytes, dd seeks past it instead of writing it. On filesystems that support sparse files, this leaves a hole in the output file and saves disk space.
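The effect of conv=sparse can be seen by comparing a file's apparent size with the disk space it occupies. This is a sketch assuming GNU dd and a filesystem that supports holes; the file names are hypothetical, and the du figures will vary by system:

```shell
# 1 MiB of zeros, written out normally.
dd if=/dev/zero of=dense.bin bs=4096 count=256 2>/dev/null

# The same data copied with conv=sparse: all-zero blocks are skipped with a seek.
dd if=dense.bin of=sparse.bin bs=4096 conv=sparse 2>/dev/null

# Same apparent size and same content...
wc -c < dense.bin
wc -c < sparse.bin
cmp dense.bin sparse.bin && echo "same content"

# ...but sparse.bin usually occupies fewer disk blocks (filesystem-dependent).
du -k dense.bin sparse.bin
```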

Data integrity and error handling

  • noerror → continue even if read fails
  • sync → pad each short or failed read to the full block size with zeros, so later data stays at its original offset
  • Useful for disk recovery

Example:

dd if=/dev/sda of=recovery.img conv=noerror,sync

Attempts to salvage data from a failing disk.
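Real read errors cannot be reproduced safely, but the padding done by conv=sync can be observed with a short read standing in for a bad one. A minimal sketch, with padded.bin as a hypothetical output file:

```shell
# Only 3 bytes arrive for an 8-byte input block. conv=sync pads the block
# to the full block size with NUL bytes, so 8 bytes are written.
printf 'abc' | dd bs=8 conv=sync of=padded.bin 2>/dev/null

wc -c < padded.bin            # prints: 8
od -c padded.bin              # shows a b c followed by five \0 bytes
```

In a recovery run, the same padding replaces each unreadable block with zeros of the same length, which keeps everything after it aligned with the original disk layout.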


Progress reporting

  • status=progress → continuous updates
  • status=none → silent mode
  • default → final summary only

Output example:

1000+0 records in
1000+0 records out
512000 bytes copied

Meaning:

Record counts are shown in w+p format:

  • w = full blocks
  • p = partial blocks
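A partial block appears whenever the input size is not a multiple of the block size. The sketch below uses a hypothetical 1100-byte file, which is two full 512-byte blocks plus one partial block of 76 bytes, and captures the statistics that dd writes to standard error:

```shell
# Create a 1100-byte file: 2 x 512 bytes + 76 bytes.
dd if=/dev/zero of=sample.bin bs=1100 count=1 2>/dev/null

# LC_ALL=C keeps the messages in English; 2> saves them to stats.txt.
LC_ALL=C dd if=sample.bin of=/dev/null bs=512 2> stats.txt
cat stats.txt
# The first two lines are:
#   2+1 records in
#   2+1 records out
```

The third line of the statistics (bytes copied, elapsed time, rate) depends on the system and is not shown here.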

if=file

Read from file instead of standard input.

of=file

Write to file instead of standard output. Unless ‘conv=notrunc’ is given, truncate file before writing it.

ibs=bytes

Set the input block size to bytes. This makes dd read bytes per block. The default is 512 bytes.

obs=bytes

Set the output block size to bytes. This makes dd write bytes per block. The default is 512 bytes.

bs=bytes

Set both input and output block sizes to bytes. This makes dd read and write bytes per block, overriding any ‘ibs’ and ‘obs’ settings. In addition, if no data-transforming conv operand is specified, input is copied to the output as soon as it’s read, even if it is smaller than the block size.
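The difference between setting ibs= and obs= separately and setting bs= shows up in the record counts. A small sketch with a hypothetical 8-byte input file:

```shell
printf 'abcdefgh' > eight.txt

# ibs=3 obs=4: input is read as 3+3+2 bytes and re-blocked into 4+4 bytes.
LC_ALL=C dd if=eight.txt of=reblocked.txt ibs=3 obs=4 2> reblock_stats.txt
cat reblock_stats.txt
# The first two lines are:
#   2+1 records in
#   2+0 records out

# bs=3: each block is written as soon as it is read, so output mirrors input.
LC_ALL=C dd if=eight.txt of=direct.txt bs=3 2> direct_stats.txt
cat direct_stats.txt
# The first two lines are:
#   2+1 records in
#   2+1 records out
```

Both runs produce the same 8-byte output; only the way the data is grouped into reads and writes differs.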

cbs=bytes

Set the conversion block size to bytes. When converting variable-length records to fixed-length ones (conv=block) or the reverse (conv=unblock), use bytes as the fixed record length.

skip=n, iseek=n

Skip n ‘ibs’-byte blocks in the input file before copying. If n ends in the letter ‘B’, interpret n as a byte count rather than a block count. (‘B’ and the ‘iseek=’ spelling are GNU extensions to POSIX.)

seek=n, oseek=n

Skip n ‘obs’-byte blocks in the output file before truncating or copying. If n ends in the letter ‘B’, interpret n as a byte count rather than a block count. (‘B’ and the ‘oseek=’ spelling are GNU extensions to POSIX.)

count=n

Copy n ‘ibs’-byte blocks from the input file, instead of everything until the end of the file. If n ends in the letter ‘B’, interpret n as a byte count rather than a block count; this is a GNU extension to POSIX. If short reads occur, as could be the case when reading from a pipe for example, ‘iflag=fullblock’ ensures that ‘count=’ counts complete input blocks rather than input read operations. As an extension to POSIX, ‘count=0’ copies zero blocks instead of copying all blocks.

status=level

Specify the amount of information printed. If this operand is given multiple times, the last one takes precedence. The level value can be one of the following:

none

Do not print any informational or warning messages to standard error. Error messages are output as normal.

noxfer

Do not print the final transfer rate and volume statistics that normally make up the last status line.

progress

Print the transfer rate and volume statistics on standard error, when processing each input block. Statistics are output on a single line at most once every second, but updates can be delayed when waiting on I/O.