Linux Tools: split, csplit etc.
Tools for Handling Text
Note: A separate page is devoted to sed, the Stream EDitor.
split*
csplit
The csplit utility writes the pieces of FILE, separated by PATTERN(s), to files named xx00, xx01, and so on.
Syntax:
csplit [OPTION]... FILE PATTERN...
csplit also prints the byte count of each output file to standard output; you can suppress this with --quiet (or -s).
To repeat the previous pattern as many times as possible, append the argument {*} after it.
To avoid creating empty output files, use the option --elide-empty-files (or -z).
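For instance, to split the file my_file.txt into chunks (files), each starting at a line that begins with Error:, type the following (the quotes keep the shell from interpreting the pattern and the braces):
csplit my_file.txt '/^Error:/' '{*}'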
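A minimal sketch of this command on a made-up log file, assuming GNU csplit. The scratch directory, the variable name demo_dir, and the file contents are illustrative only; --quiet is added just to keep the byte counts out of the way. Because the first line of this sample already matches the pattern, the piece before it is empty, which is the case --elide-empty-files is meant for.

```shell
# Work in a scratch directory so the generated xx files are easy to inspect.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# Hypothetical input whose very first line already matches the pattern.
printf '%s\n' 'Error: disk full' 'retrying' 'Error: timeout' 'giving up' > my_file.txt

# Default behaviour: the empty piece before the first match is still written as xx00.
csplit --quiet my_file.txt '/^Error:/' '{*}'
ls xx*    # lists xx00 (empty), xx01 and xx02
rm xx*

# With --elide-empty-files the empty piece is skipped, so numbering starts at the first real chunk.
csplit --quiet --elide-empty-files my_file.txt '/^Error:/' '{*}'
ls xx*    # lists xx00 and xx01
```

After the second run, xx00 holds the "disk full" section and xx01 holds the "timeout" section.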
If you also want the output files to be named subfile00.txt, subfile01.txt, and so on, type:
csplit --prefix=subfile --suffix-format=%02d.txt my_file.txt '/^Error:/' '{*}'
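To see what the naming options produce, here is a small sketch, again assuming GNU csplit. The sample contents and the variable name work_dir are hypothetical; only --prefix and --suffix-format come from the text above.

```shell
# Scratch directory for the demonstration.
work_dir=$(mktemp -d)
cd "$work_dir"

# Hypothetical sample: one leading line, then two "Error:" sections.
printf '%s\n' 'startup ok' 'Error: disk full' 'retrying' 'Error: timeout' 'giving up' > my_file.txt

# Split before every line starting with "Error:" and name the pieces subfileNN.txt.
csplit --quiet --prefix=subfile --suffix-format=%02d.txt my_file.txt '/^Error:/' '{*}'

ls subfile*.txt
```

Here subfile00.txt receives the leading "startup ok" line, and subfile01.txt and subfile02.txt each begin with an Error: line. The %02d in the suffix format is a printf-style conversion, so the counter is zero-padded to two digits.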
If you don't want the matched line to be included in your chunk, use the option --suppress-matched. This is useful when you add marker comments to serve as break points, such as [start-new-chunk-here], and you don't want these comments to be included at the head of a chunk.
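The marker scenario can be sketched as follows, assuming a GNU csplit recent enough to support --suppress-matched. The file name notes.txt, its contents, and the variable chunk_dir are invented for the example. The opening bracket is escaped because csplit treats the pattern as a regular expression.

```shell
# Scratch directory for the demonstration.
chunk_dir=$(mktemp -d)
cd "$chunk_dir"

# Hypothetical input with marker lines between the chunks.
printf '%s\n' 'alpha' '[start-new-chunk-here]' 'beta' 'gamma' '[start-new-chunk-here]' 'delta' > notes.txt

# Split at each marker line and drop the marker itself from the output.
csplit --quiet --suppress-matched notes.txt '/^\[start-new-chunk-here]$/' '{*}'

cat xx01    # the second chunk: beta and gamma, without the marker line
```

This produces three files (xx00, xx01, xx02), none of which contains a marker line.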
...