Linux Tools: split, csplit etc.

Tools for Handling Text

Note: sed, the Stream EDitor, is covered in a separate section.

split*

csplit

The csplit utility splits FILE into pieces at each PATTERN and writes the pieces to files named xx00, xx01, and so on.

Syntax:

csplit [OPTION]... FILE PATTERN...

csplit also prints the byte count of each output file to standard output; you can suppress this output with option --quiet (short form -s).


To repeat the previous pattern as many times as possible, add the argument {*} after it.

To prevent csplit from creating empty output files, use option --elide-empty-files


For instance, to split file my_file.txt into chunks (files), each starting with a line that begins with Error:, type:

csplit my_file.txt /^Error:/ {*}
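
As a minimal sketch of how this behaves, the script below builds a small sample file and splits it. The file contents and the scratch directory are assumptions made for illustration only. It also uses the GNU csplit options --quiet (no byte counts) and --elide-empty-files: because the sample starts with an Error: line, the piece before the first match would otherwise be written as an empty xx00.

```shell
# Work in a scratch directory so the output files do not clutter the current one.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: two entries, each starting with "Error:".
printf '%s\n' 'Error: disk full' 'details A' 'Error: timeout' 'details B' > my_file.txt

# Quote the pattern and {*} so the shell passes them to csplit unchanged.
# --quiet suppresses the byte counts; --elide-empty-files drops the empty
# piece that would otherwise precede the first match.
csplit --quiet --elide-empty-files my_file.txt '/^Error:/' '{*}'

# Show the first line of every piece; each should begin with "Error:".
head -n 1 xx*
```

With this sample, two pieces are expected: xx00 (starting with Error: disk full) and xx01 (starting with Error: timeout).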

If you additionally want the output files to be named subfile00.txt, subfile01.txt, and so on, type:

csplit --prefix=subfile --suffix-format=%02d.txt my_file.txt /^Error:/ {*}
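
The naming options can be tried out with a short script such as the one below. It is only a sketch: the sample file and its contents are hypothetical, and --quiet and --elide-empty-files are added so that no byte counts are printed and no empty first piece is written.

```shell
# Scratch directory and hypothetical sample input, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'Error: disk full' 'details A' 'Error: timeout' 'details B' > my_file.txt

# --prefix replaces the default "xx"; --suffix-format takes a printf-style
# conversion, here a two-digit, zero-padded piece number followed by ".txt".
csplit --quiet --elide-empty-files \
       --prefix=subfile --suffix-format='%02d.txt' \
       my_file.txt '/^Error:/' '{*}'

# List the resulting pieces.
ls subfile*
```

For this sample the listing should contain subfile00.txt and subfile01.txt.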

If you don't want the matching line to be included in your chunk, use option --suppress-matched. This is useful when you add comments that serve as breaking points, such as [start-new-chunk-here], and you don't want these comments to be included at the head of a chunk.
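
The marker approach might look like the following sketch. The file name notes.txt and its contents are assumptions for illustration. The opening bracket is escaped in the pattern because [ would otherwise start a bracket expression in the regular expression.

```shell
# Scratch directory and hypothetical sample input, for illustration only.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf '%s\n' 'intro line' \
              '[start-new-chunk-here]' 'chapter one' \
              '[start-new-chunk-here]' 'chapter two' > notes.txt

# Split at every marker line; --suppress-matched leaves the marker itself
# out of the pieces. --quiet suppresses the byte counts.
csplit --quiet --suppress-matched notes.txt '/^\[start-new-chunk-here]$/' '{*}'

# Check that no marker line survived in any piece.
grep -c 'start-new-chunk-here' xx*
```

With this sample, three pieces are expected (xx00, xx01, xx02), and grep should report a count of 0 for each of them.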

...