How to split and merge files under Linux

Because of limits on network transmission, we often need to split large files under Linux. A large file is split into several smaller files, transferred, and merged back together once the transfer is complete.

1. File splitting - split

The split command makes it easy to split large files under Linux.

[1] Command syntax

# -a: length of the output file name suffix (the default is 2: aa, ab, ...)
# -d: use numeric suffixes (00, 01, ...) instead of alphabetic ones
# -l: split by line count (number of lines per output file; the default is 1000)
# -b: split by byte size (units such as k and m are supported)
# -C: split by size while keeping lines intact (at most this many bytes of whole lines per output file)
split [-a <suffix length>] [-d] [-l <number of lines>] [-b <bytes>] [-C <bytes>] [file to split] [output file prefix]

[2] Example of use

# split by line count (300000 lines per output file)
$ split -l 300000 users.sql /data/users_
# use numeric suffix
$ split -d -l 300000 users.sql /data/users_
# split by byte size
$ split -d -b 100m users.sql /data/users_
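The -C option from the syntax list has no example above. The following is a minimal sketch, assuming GNU split, of how -b and -C differ; the scratch file demo.txt and the prefixes bytes_ and lines_ are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: 5 lines of 10 bytes each (9 characters + newline).
for i in 1 2 3 4 5; do
    printf 'line-%04d\n' "$i"
done > demo.txt

# -b cuts at exactly 25 bytes, so a line can end up split across two pieces.
split -b 25 demo.txt bytes_

# -C puts at most 25 bytes in each piece but only whole lines,
# so each full piece holds 2 lines (20 bytes).
split -C 25 demo.txt lines_

# Compare the sizes of the pieces produced by the two modes.
wc -c bytes_* lines_*
```

For line-oriented data such as SQL dumps or logs, -C is usually the safer choice when each piece may be inspected or processed on its own before merging.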

[3] Help information

# help information
$ split --help
Usage: split [OPTION]... [FILE [PREFIX]]
Output pieces of FILE to PREFIXaa, PREFIXab, ...;
default size is 1000 lines, and default PREFIX is 'x'.
With no FILE, or when FILE is -, read standard input.
Mandatory arguments to long options are mandatory for short options too.
  -a, --suffix-length=N   generate suffixes of length N (default 2)
      --additional-suffix=SUFFIX  append an additional SUFFIX to file names
  -b, --bytes=SIZE        put SIZE bytes per output file
  -C, --line-bytes=SIZE   put at most SIZE bytes of records per output file
  -d                      use numeric suffixes starting at 0, not alphabetic
      --numeric-suffixes[=FROM]  same as -d, but allow setting the start value
  -e, --elide-empty-files  do not generate empty output files with '-n'
      --filter=COMMAND    write to shell COMMAND; file name is $FILE
  -l, --lines=NUMBER      put NUMBER lines/records per output file
  -n, --number=CHUNKS     generate CHUNKS output files; see explanation below
  -t, --separator=SEP     use SEP instead of newline as the record separator;
                            '\0' (zero) specifies the NUL character
  -u, --unbuffered        immediately copy input to output with '-n r/...'
      --verbose           print a diagnostic just before each
                            output file is opened
      --help     display this help and exit
      --version  output version information and exit
The SIZE argument is an integer and optional unit (example: 10K is 10*1024).
Units are K,M,G,T,P,E,Z,Y (powers of 1024) or KB,MB,... (powers of 1000).
CHUNKS may be:
  N       split into N files based on size of input
  K/N     output Kth of N to stdout
  l/N     split into N files without splitting lines/records
  l/K/N   output Kth of N to stdout without splitting lines/records
  r/N     like 'l' but use round robin distribution
  r/K/N   likewise but only output Kth of N to stdout
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/split>
or available locally via: info '(coreutils) split invocation'
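The CHUNKS forms listed in the help text can be easier to follow with a small run. This is a sketch assuming GNU split; numbers.txt and the prefix part_ are hypothetical names.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input: the numbers 1 to 10, one per line.
seq 1 10 > numbers.txt

# l/3: split into 3 files of roughly equal size without breaking any line.
split -n l/3 numbers.txt part_
ls part_*

# Concatenating the pieces in glob order reproduces the input.
cat part_* | cmp - numbers.txt && echo "pieces match the original"

# l/2/3: print only the 2nd of the 3 chunks to stdout; no files are written.
split -n l/2/3 numbers.txt
```

Unlike -l and -b, which fix the size of each piece, -n fixes the number of pieces, which can be convenient when the pieces are handed to a known number of workers or transfer sessions.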

2. File merge - cat

The cat command makes it just as easy to merge multiple small files under Linux.

[1] Command syntax

# -n: number all output lines
# -e: display $ at the end of each line (also shows non-printing characters)
# -t: display TAB characters as ^I (also shows non-printing characters)
cat [-n] [-e] [-t] [file ...]
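As a quick illustration of the display options above, the following sketch pipes a hypothetical two-line sample through cat. These options only change how the content is displayed, so they should not be used when merging files.

```shell
#!/bin/sh
# Hypothetical sample: the first line contains a TAB between "a" and "b".
sample() {
    printf 'a\tb\nc\n'
}

# -n: prefix every line with its line number.
sample | cat -n

# -e and -t together: mark line ends with $ and show the TAB as ^I.
shown=$(sample | cat -e -t)
printf '%s\n' "$shown"
```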

[2] Example of use

# merge files
$ cat /data/users_* > users.sql
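The shell expands users_* in sorted order, which matches the order in which split names its pieces, so the merged file has the same content as the original. After a network transfer it is still worth confirming this. Below is a minimal round-trip sketch, assuming GNU coreutils (split, cat, sha256sum); the generated users.sql is only a hypothetical stand-in for a real large file, and the transfer step is left as a comment.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical stand-in for a large file such as users.sql.
seq 1 100000 > users.sql

# Sender side: record a checksum, then split into 100 KB pieces
# named users_00, users_01, ...
sha256sum users.sql > users.sql.sha256
split -d -b 100k users.sql users_

# ... transfer users_* and users.sql.sha256 to the other machine ...

# Receiver side (simulated here by deleting the original file):
rm users.sql
cat users_* > users.sql

# Verify that the merged file matches the recorded checksum.
sha256sum -c users.sql.sha256
```

If the check fails, a piece is probably missing or damaged; storing a checksum for every piece as well makes it possible to re-send only the affected one.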

[3] Help information

# help information
$ cat --help
Usage: cat [OPTION]... [FILE]...
Concatenate FILE(s) to standard output.
With no FILE, or when FILE is -, read standard input.
  -A, --show-all           equivalent to -vET
  -b, --number-nonblank    number nonempty output lines, overrides -n
  -e                       equivalent to -vE
  -E, --show-ends          display $ at end of each line
  -n, --number             number all output lines
  -s, --squeeze-blank      suppress repeated empty output lines
  -t                       equivalent to -vT
  -T, --show-tabs          display TAB characters as ^I
  -u                       (ignored)
  -v, --show-nonprinting   use ^ and M- notation, except for LFD and TAB
      --help     display this help and exit
      --version  output version information and exit
Examples:
  cat f - g  Output f's contents, then standard input, then g's contents.
  cat        Copy standard input to standard output.
GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
Full documentation at: <http://www.gnu.org/software/coreutils/cat>
or available locally via: info '(coreutils) cat invocation'


 


Origin blog.csdn.net/foolere/article/details/131273875