{"id":5625,"date":"2019-09-21T17:11:12","date_gmt":"2019-09-21T21:11:12","guid":{"rendered":"http:\/\/www.linux-databook.info\/?page_id=5625"},"modified":"2023-08-28T09:40:11","modified_gmt":"2023-08-28T13:40:11","slug":"data-streams","status":"publish","type":"page","link":"http:\/\/www.linux-databook.info\/?page_id=5625","title":{"rendered":"Data Streams"},"content":{"rendered":"\n<p><em><strong>Author\u2019s note:<\/strong> Much of the content in this article is excerpted &#8212; with some significant changes to work in this format &#8212; from Chapter 3, Data Streams, of my book,  <\/em>&#8220;<a rel=\"noreferrer noopener\" aria-label=\"The Linux Philosophy for SysAdmins (opens in a new tab)\" href=\"http:\/\/www.both.org\/?page_id=903\" target=\"_blank\">The Linux Philosophy for SysAdmins<\/a><em>.<\/em>&#8220;<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<p>Everything in Linux revolves around streams of data \u2013\nparticularly text streams. Data streams are the raw materials upon\nwhich the GNU Utilities, the Linux Core Utilities, and many other\ncommand line tools perform their work. \n<\/p>\n\n\n\n<p>As its name implies, a data stream is a stream of data \u2013 especially text data \u2013 being passed from one file, device, or program to another using STDIO. This article introduces the use of pipes to connect streams of data from one utility program to another using STDIO. You will learn that the function of these programs is to transform the data in some manner. You will also learn about the use of redirection to redirect the data to a file.  
<\/p>\n\n\n\n<p>I use the term \u201ctransform\u201d in conjunction with these programs\nbecause the primary task of each is to transform the incoming data\nfrom STDIN in a specific way as intended by the SysAdmin and to send\nthe transformed data to STDOUT for possible use by another\ntransformer program or redirection to a file.<\/p>\n\n\n\n<p>The standard term, \u201cfilters,\u201d implies something with which I\ndon&#8217;t agree. By definition, a filter is a device or a tool that\nremoves something, just as an air filter removes airborne\ncontaminants so that the internal combustion engine of your\nautomobile does not grind itself to death on those particulates. In\nmy high school and college chemistry classes, filter paper was used\nto remove particulates from a liquid. The air filter in my home HVAC\nsystem removes particulates that I don&#8217;t want to breathe.<\/p>\n\n\n\n<p>Although they do sometimes filter out unwanted data from a stream,\nI much prefer the term \u201ctransformers\u201d because these utilities do\nso much more. They can add data to a stream, modify the data in some\namazing ways, sort it, rearrange the data in each line, perform\noperations based on the contents of the data stream, and so much\nmore. Feel free to use whichever term you prefer, but I prefer\ntransformers. I expect that I am alone in this.<\/p>\n\n\n\n<p>Data streams can be manipulated by inserting transformers into the stream using pipes. Each transformer program is used by the SysAdmin to perform some operation on the data in the stream, thus changing its contents in some manner. Redirection can then be used at the end of the pipeline to direct the data stream to a file. 
That file could be an actual data file on the hard drive, or a device file such as a drive partition, a printer, a terminal, a pseudo-terminal, or any other device connected to a computer.<\/p>\n\n\n\n<p>The ability to manipulate these data streams using these small yet\npowerful transformer programs is central to the power of the Linux\ncommand line interface. Many of the Core Utilities are transformer\nprograms and use STDIO. \n<\/p>\n\n\n\n<p>In the Unix and Linux worlds, a stream is a flow of text data that\noriginates at some source; the stream may flow to one or more\nprograms that transform it in some way, and then it may be stored in\na file or displayed in a terminal session. As a SysAdmin your job is\nintimately associated with manipulating the creation and flow of\nthese data streams. In this article we will explore data streams \u2013\nwhat they are, how to create them, and a little bit about how to use\nthem.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Text streams \u2013 a universal interface<\/h2>\n\n\n\n<p>The use of Standard Input\/Output (STDIO) for program input and\noutput is a key foundation of the Linux way of doing things. STDIO\nwas first developed for Unix and has found its way into most other\noperating systems since then, including DOS, Windows, and Linux.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\"><p><em>\u201cThis is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. 
Write programs to handle text streams, because that is a universal interface.\u201d<\/em>  <\/p><cite>Doug McIlroy, Basics of the Unix Philosophy<\/cite><\/blockquote>\n\n\n\n<h2 class=\"wp-block-heading\">STDIO<\/h2>\n\n\n\n<p>STDIO was developed by Ken Thompson as a part of the\ninfrastructure required to implement pipes on early versions of Unix.\nPrograms that implement STDIO use standardized file handles for input\nand output rather than files that are stored on a disk or other\nrecording media. STDIO is best described as a buffered data stream\nand its primary function is to stream data from the output of one\nprogram, file, or device to the input of another program, file, or\ndevice.<\/p>\n\n\n\n<p>There are three STDIO data streams, each of which is automatically\nopened as a file at the startup of a program \u2013 well those programs\nthat use STDIO. Each STDIO data stream is associated with a file\nhandle which is just a set of metadata that describes the attributes\nof the file. File handles 0, 1, and 2 are explicitly defined by\nconvention and long practice as STDIN, STDOUT, and STDERR,\nrespectively.<\/p>\n\n\n\n<p><strong>STDIN, File handle 0<\/strong>, is standard input which is usually\ninput from the keyboard. STDIN can be redirected from any file\nincluding device files instead of the keyboard. It is not common to\nneed to redirect STDIN but it can be done.<\/p>\n\n\n\n<p><strong>STDOUT, File handle 1<\/strong>, is standard output which sends the\ndata stream to the display by default. It is common to redirect\nSTDOUT to a file or to pipe it to another program for further\nprocessing.<\/p>\n\n\n\n<p><strong>STDERR, File handle 2<\/strong>. The data stream for STDERR is also\nusually sent to the display. \n<\/p>\n\n\n\n<p>If STDOUT is redirected to a file, STDERR continues to be\ndisplayed on the screen. 
This ensures that when the data stream\nitself is not displayed on the terminal, STDERR still is, so\nthat the user will see any errors resulting from execution\nof the program. STDERR can also be redirected to the same file or passed\non to the next transformer program in a pipeline.<\/p>\n\n\n\n<p>STDIO is implemented as part of the C standard library. Its interface\nis declared in the header file stdio.h, which programs include in\ntheir source code in order to use it. \n<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Simple streams<\/h2>\n\n\n\n<p>You can perform the following experiments safely in the \/tmp\ndirectory of your Linux host. As the root user, make \/tmp the PWD,\ncreate a test directory and then make the new directory the PWD.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">#<strong> cd \/tmp ; mkdir test ; cd test<\/strong><\/pre>\n\n\n\n<p>Enter and run the following command line program to create some\nfiles with content on the drive. We use the dmesg command simply to\nprovide data for the files to contain. The contents don\u2019t matter so\nmuch as just the fact that each file has some content.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">#<strong> for I in 0 1 2 3 4 5 6 7 8 9 ; do dmesg &gt; file$I.txt ; done <\/strong><\/pre>\n\n\n\n<p>Verify that there are now at least 10 files in \/tmp\/test with the\nnames file0.txt through file9.txt. 
\n<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"><strong># ll<\/strong>\n<strong>total 1320<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file0.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file1.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file2.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file3.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file4.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file5.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file6.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file7.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file8.txt<\/strong>\n<strong>-rw-r--r-- 1 root root 131402 Oct 17 15:50 file9.txt<\/strong> <\/pre>\n\n\n\n<p>We have generated data streams using the dmesg command which was\nredirected to a series of files. Most of the Core Utilities use STDIO\nas their output stream and those that generate data streams, rather\nthan acting to transform the data stream in some way, can be used to\ncreate the data streams that we will use for our experiments. Data\nstreams can be as short as one line or even a single character, and\nas long as needed. \n<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Exploring the hard drive<\/h2>\n\n\n\n<p>It is now time to do a little exploring. In this experiment we\nwill look at some of the filesystem structures.<\/p>\n\n\n\n<p>Let&#8217;s start with something simple. You should be at least somewhat\nfamiliar with the <strong>dd<\/strong> command. Officially known as \u201cdisk\ndump,\u201d many SysAdmins call it \u201cdisk destroyer\u201d for good reason.\nMany of us have inadvertently destroyed the contents of an entire\nhard drive or partition using the <strong>dd<\/strong> command. 
That is why we\nwill hang out in the \/tmp\/test directory to perform some of these\nexperiments.<\/p>\n\n\n\n<p>Despite its reputation, <strong>dd<\/strong> can be quite useful in exploring\nvarious types of storage media, hard drives, and partitions. We will\nalso use it as a tool to explore other aspects of Linux.<\/p>\n\n\n\n<p>Login to a terminal session as root if you are not already. We\nfirst need to determine the device special file for your hard drive\nusing the <strong>lsblk<\/strong> command. \n<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@studentvm1 test]# <strong>lsblk -i<\/strong>\n NAME                                 MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT\n sda                                    8:0    0   60G  0 disk  \n |-sda1                                 8:1    0    1G  0 part \/boot\n `-sda2                                 8:2    0   59G  0 part  \n   |-fedora_studentvm1-pool00_tmeta   253:0    0    4M  0 lvm   \n   | `-fedora_studentvm1-pool00-tpool 253:2    0    2G  0 lvm   \n   |   |-fedora_studentvm1-root       253:3    0    2G  0 lvm  \/\n   |   `-fedora_studentvm1-pool00     253:6    0    2G  0 lvm   \n   |-fedora_studentvm1-pool00_tdata   253:1    0    2G  0 lvm   \n   | `-fedora_studentvm1-pool00-tpool 253:2    0    2G  0 lvm   \n   |   |-fedora_studentvm1-root       253:3    0    2G  0 lvm  \/\n   |   `-fedora_studentvm1-pool00     253:6    0    2G  0 lvm   \n   |-fedora_studentvm1-swap           253:4    0   10G  0 lvm  [SWAP]\n   |-fedora_studentvm1-usr            253:5    0   15G  0 lvm  \/usr\n   |-fedora_studentvm1-home           253:7    0    2G  0 lvm  \/home\n   |-fedora_studentvm1-var            253:8    0   10G  0 lvm  \/var\n   `-fedora_studentvm1-tmp            253:9    0    5G  0 lvm  \/tmp\n sr0                                   11:0    1 1024M  0 rom  <\/pre>\n\n\n\n<p>We can see from this that there is only one hard drive on this\nhost, that the device special file associated with it is \/dev\/sda,\nand that 
it has two partitions. The \/dev\/sda1 partition is the boot\npartition, and the \/dev\/sda2 partition contains a volume group on\nwhich the rest of the host\u2019s logical volumes have been created.<\/p>\n\n\n\n<p>As root in the terminal session, use the <strong>dd<\/strong> command to view\nthe boot record of the hard drive, assuming it is assigned to the\n\/dev\/sda device. The bs= argument is not what you might think, it\nsimply specifies the block size, and the count= argument specifies\nthe number of blocks to dump to STDIO. The if= argument specifies the\nsource of the data stream, in this case, the \/dev\/sda device. Notice\nthat we are not looking at the first block of the partition, we are\nlooking at the very first block of the hard drive.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@studentvm1 test]# <strong>dd if=\/dev\/sda bs=512 count=1<\/strong>\n\ufffdc\ufffd#\ufffd\u043c\ufffd\ufffd\ufffd\u060e\ufffd\ufffd\ufffd|\ufffd#\ufffd#\ufffd\ufffd\ufffd!#\ufffd\ufffd8#u\n                            \ufffd\ufffd#\ufffd\ufffd\ufffdu\ufffd\ufffd#\ufffd#\ufffd#\ufffd|\ufffd\ufffd\ufffdt#\ufffdL#\ufffd#\ufffd|\ufffd\ufffd\ufffd#\ufffd\ufffd\ufffd\ufffd\ufffd\u0080t\ufffd\ufffdpt#\ufffd\ufffd\ufffdy|1\ufffd\ufffd\u060e\u043c \ufffd\ufffdd|&lt;\ufffdt#\ufffd\ufffdR\ufffd|1\ufffd\ufffdD#@\ufffdD\ufffd\ufffdD#\ufffd##f\ufffd#\\|f\ufffdf\ufffd#`|f\ufffd\\\n                                      \ufffdD#p\ufffdB\ufffd#r\ufffdp\ufffd#\ufffdK`#\ufffd#\ufffd\ufffd1\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd#a`\ufffd\ufffd\ufffd#f\ufffd\ufffdu#\ufffd\ufffd\ufffd\ufffdf1\ufffdf\ufffdTCPAf\ufffd#f\ufffd#a\ufffd&amp;Z|\ufffd#}\ufffd#\ufffd.}\ufffd4\ufffd3}\ufffd.\ufffd#\ufffd\ufffdGRUB GeomHard DiskRead Error\n\ufffd#\ufffd\ufffd#\ufffd&lt;u\ufffd\ufffd\u073b\u07ae\ufffd###\ufffd\ufffd\ufffd \ufffd\ufffd\ufffd\ufffd\ufffd\ufffd \ufffd_U\ufffd1+0 records in\n1+0 records out\n512 bytes copied, 4.3856e-05 s, 11.7 MB\/s <\/pre>\n\n\n\n<p>This prints the text of 
the boot record, which is the first block\non the disk \u2013 any disk. In this case, it holds the first stage of the\nboot loader and, although it is unreadable because it is stored in\nbinary format, the partition table. Because this is a bootable device,\nstage 1 of GRUB is located in this sector, which is why the string\n\u201cGRUB\u201d is legible in the output. The last three lines contain data\nabout the number of records and bytes processed.<\/p>\n\n\n\n<p>Starting with the beginning of \/dev\/sda1, let&#8217;s look at a few\nblocks of data at a time to find what we want. The command is similar\nto the previous one except that we have specified a few more blocks\nof data to view. You may have to specify fewer blocks if your\nterminal is not large enough to display all of the data at one time,\nor you can pipe the data through the less utility and use that to\npage through the data. Either way works. Remember we are doing all of\nthis as root user because non-root users do not have the required\npermissions.<\/p>\n\n\n\n<p>Enter the same command as you did in the previous experiment, but\nread from \/dev\/sda1 and increase the block count to 100, as shown\nbelow, in order to show more data. 
\n<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@studentvm1 test]# <strong>dd if=\/dev\/sda1 bs=512 count=100<\/strong>\n##33\ufffd\ufffd#:\ufffd##\ufffd\ufffd :o\ufffd[:o\ufffd[#\ufffd\ufffdS\ufffd###\ufffdq[#\n                                   #&lt;\ufffd#{5OZh\ufffdGJ\u035e#t\ufffd\u04b0##boot\/bootysimage\/booC\ufffddp\ufffd\ufffdG'\ufffd*)\ufffd#A\ufffd##@\n      #\ufffdq[\n\ufffd## ##  ###\ufffd#\ufffd\ufffd\ufffdTo=###&lt;#8\ufffd\ufffd\ufffd#'#\ufffd###\ufffd#\ufffd\ufffd\ufffd\ufffd\ufffd#\ufffd'  \ufffd\ufffd\ufffd\ufffd\ufffd#Xi  \ufffd#\ufffd\ufffd`  qT\ufffd\ufffd\ufffd\n   &lt;\ufffd\ufffd\ufffd\n       \ufffd  r\ufffd\ufffd\ufffd\ufffd  ]\ufffd#\ufffd#\ufffd##\ufffd##\ufffd##\ufffd#\ufffd##\ufffd##\ufffd##\ufffd#\ufffd##\ufffd##\ufffd#\ufffd\ufffd#\ufffd#\ufffd##\ufffd#\ufffd##\ufffd##\ufffd#\ufffd\ufffd#\ufffd#\ufffd\ufffd\ufffd\ufffd# \ufffd \ufffd# \ufffd# \ufffd#\n\ufffd\n\ufffd#\n\ufffd#\n\ufffd#\n  \ufffd\n   \ufffd#\n     \ufffd#\n       \ufffd#\n         \ufffd\n          \ufffd#\n            \ufffd#\n         \ufffd#100+0 records in\n100+0 records out\n51200 bytes (51 kB, 50 KiB) copied, 0.00117615 s, 43.5 MB\/s <\/pre>\n\n\n\n<p>Now try this command. I won\u2019t reproduce the entire data stream\nhere because it would take up huge amounts of space. Use Ctrl-C to\nbreak out and stop the stream of data.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@studentvm1 test]# <strong>dd if=\/dev\/sda<\/strong><\/pre>\n\n\n\n<p>This command produces a stream of data that is the complete\ncontent of the hard drive, \/dev\/sda, including the boot record, the\npartition table, and all of the partitions and their content. This\ndata could be redirected to a file for use as a complete backup from\nwhich a bare metal recovery can be performed. It could also be sent\ndirectly to another hard drive to clone the first. 
But do not perform\nthis particular experiment, because it overwrites everything on the\ntarget drive.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[root@studentvm1 test]# <strong>dd if=\/dev\/sda of=\/dev\/sdx<\/strong><\/pre>\n\n\n\n<p>You can see that the <strong>dd<\/strong> command can be very useful for\nexploring the structures of various types of filesystems, locating\ndata on a defective storage device, and much more. It also produces a\nstream of data that we can view or modify with the transformer\nutilities. \n<\/p>\n\n\n\n<p>The real point here is that <strong>dd<\/strong>, like so many Linux commands,\nproduces a stream of data as its output. That data stream can be\nsearched and manipulated in many ways using other tools. It can even\nbe used for Ghost-like backups or disk duplication.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Randomness<\/h2>\n\n\n\n<p>It turns out that randomness is a desirable thing in computers.\nWho knew? There are a number of reasons that SysAdmins might want to\ngenerate a stream of random data. A stream of random data is\nsometimes useful to overwrite the contents of a complete partition,\nsuch as \/dev\/sda1, or even the entire hard drive as in \/dev\/sda. \n<\/p>\n\n\n\n<p>Perform this experiment as a non-root user. Enter this command to\nprint an unending stream of random data to STDOUT. \n<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[student@studentvm1 ~]$ <strong>cat \/dev\/urandom<\/strong><\/pre>\n\n\n\n<p>Use Ctrl-C to break out and stop the stream of data. You may need\nto use Ctrl-C multiple times.<\/p>\n\n\n\n<p>Random data is also used as the input seed to programs that\ngenerate random passwords and random data and numbers for use in\nscientific and statistical calculations. 
I will cover randomness and\nother interesting data sources in a bit more detail in Chapter 24:\n\u201cEverything is a file.\u201d<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Pipe dreams<\/h2>\n\n\n\n<p>Pipes are critical to our ability to do amazing things on the\ncommand line, so much so that I think it is important to recognize\nthat they were invented by Douglas McIlroy during the early days of\nUnix. Thanks, Doug! The Princeton University web site has a fragment\nof an <a href=\"https:\/\/www.princeton.edu\/~hos\/mike\/transcripts\/mcilroy.htm\" target=\"_blank\" rel=\"noreferrer noopener\">interview<\/a>\nwith McIlroy in which he discusses the creation of the pipe and the\nbeginnings of the Unix Philosophy.<\/p>\n\n\n\n<p>Notice the use of pipes in the simple command line program shown\nnext that lists each logged-in user a single time no matter how many\nlogins they have active. Perform this experiment as the student user.\nEnter the command shown below.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[student@studentvm1 ~]$ <strong>w | tail -n +3 | awk '{print $1}' | sort | uniq<\/strong>\nroot\nstudent\n[student@studentvm1 ~]$ <\/pre>\n\n\n\n<p>This command produces two lines of data that show\nthat the users root and student are both logged in. It does not show\nhow many times each user is logged in. Your results will almost\ncertainly differ from mine.<\/p>\n\n\n\n<p>Pipes \u2013 represented by the vertical bar ( | ) \u2013 are the\nsyntactical glue, the operator, that connects these command line\nutilities together. Pipes allow the Standard Output of one command\nto be \u201cpiped,\u201d that is, streamed, to the Standard Input of the\nnext command. \n<\/p>\n\n\n\n<p>In Bash, the |&amp; operator can be used to pipe the STDERR along with\nSTDOUT to STDIN of the next command. 
This is not always desirable but\nit does offer flexibility in the ability to record the STDERR data\nstream for the purposes of problem determination. \n<\/p>\n\n\n\n<p>A string of programs connected with pipes is called a pipeline and\nthe programs that use STDIO are referred to officially as filters,\nbut I prefer the term transformers.<\/p>\n\n\n\n<p>Think about how this program would have to work if we could not\npipe the data stream from one command to the next. The first command\nwould perform its task on the data and then the output from that\ncommand would have to be saved in a file. The next command would have\nto read the stream of data from the intermediate file and perform its\nmodification of the data stream, sending its own output to a new,\ntemporary data file. The third command would have to take its data\nfrom the second temporary data file and perform its own manipulation\nof the data stream and then store the resulting data stream in yet\nanother temporary file. At each step the data file names would have\nto be transferred from one command to the next in some way.<\/p>\n\n\n\n<p>I cannot even stand to think about that because it is so complex.\nRemember that simplicity rocks!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Building pipelines<\/h2>\n\n\n\n<p>When I am doing something new, solving a new problem, I usually do\nnot just type in a complete bash command pipeline from scratch off\nthe top of my head. I usually start with just one or two commands in\nthe pipeline and build from there by adding more commands to further\nprocess the data stream. This allows me to view the state of the data\nstream after each of the commands in the pipeline and make\ncorrections as they are needed.<\/p>\n\n\n\n<p>It is possible to build up very complex pipelines that can\ntransform the data stream using many different utilities that work\nwith STDIO. 
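<\/p>\n\n\n\n<p>A minimal sketch of that incremental approach rebuilds the pipeline from the previous section one stage at a time. Nothing here is new: it reuses the w, tail, awk, sort, and uniq commands already shown, and the comments assume that the output of w begins with two header lines, as it does on the host used for these experiments.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"># Stage 1: look at the raw data stream\nw\n# Stage 2: skip the two header lines\nw | tail -n +3\n# Stage 3: keep only the first field, the user name\nw | tail -n +3 | awk '{print $1}'\n# Stage 4: sort so that duplicate names are adjacent\nw | tail -n +3 | awk '{print $1}' | sort\n# Stage 5: collapse adjacent duplicates into one line per user\nw | tail -n +3 | awk '{print $1}' | sort | uniq<\/pre>\n\n\n\n<p>Running each stage on its own shows the state of the data stream at that point, which makes it easier to spot the stage that needs correcting before the next command is added. 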
\n<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Redirection<\/h2>\n\n\n\n<p>Redirection is the capability to redirect the STDOUT data stream\nof a program to a file instead of to the default target of the\ndisplay. The \u201cgreater than\u201d ( &gt; ) character, also known as \u201cgt,\u201d is\nthe syntactical symbol for redirection of STDOUT. \n<\/p>\n\n\n\n<p>Redirecting the STDOUT of a command can be used to create a file\ncontaining the results from that command.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[student@studentvm1 ~]$ <strong>df -h &gt; diskusage.txt<\/strong> <\/pre>\n\n\n\n<p>There is no output to the terminal from this command unless there is an error. This is because the STDOUT data stream is redirected to the file, while STDERR is still directed to its default target, which is the display. You can view the contents of the file you just created using this next command.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\"> [student@studentvm1 ~]$ <strong>cat diskusage.txt<\/strong>  \n Filesystem                          Size  Used Avail Use% Mounted on\n devtmpfs                            2.0G     0  2.0G   0% \/dev\n tmpfs                               2.0G     0  2.0G   0% \/dev\/shm\n tmpfs                               2.0G  1.2M  2.0G   1% \/run\n tmpfs                               2.0G     0  2.0G   0% \/sys\/fs\/cgroup\n \/dev\/mapper\/fedora_studentvm1-root  2.0G   50M  1.8G   3% \/\n \/dev\/mapper\/fedora_studentvm1-usr    15G  4.5G  9.5G  33% \/usr\n \/dev\/mapper\/fedora_studentvm1-var   9.8G  1.1G  8.2G  12% \/var\n \/dev\/mapper\/fedora_studentvm1-tmp   4.9G   21M  4.6G   1% \/tmp\n \/dev\/mapper\/fedora_studentvm1-home  2.0G  7.2M  1.8G   1% \/home\n \/dev\/sda1                           976M  221M  689M  25% \/boot\n tmpfs                               395M     0  395M   0% \/run\/user\/0\n tmpfs                               395M   12K  395M   1% \/run\/user\/1000 <\/pre>\n\n\n\n<p>When using the &gt; symbol to redirect the data stream, 
the\nspecified file is created if it does not already exist. If it already\ndoes exist the contents are overwritten by the data stream from the\ncommand. You can use double greater than symbols, &gt;&gt;, to append\nthe new data stream to any existing content in the file.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">[student@studentvm1 ~]$ <strong>df -h &gt;&gt; diskusage.txt<\/strong><\/pre>\n\n\n\n<p>You can use cat and\/or less to view the diskusage.txt file in\norder to verify that the new data was appended to the end of the\nfile.<\/p>\n\n\n\n<p>The &lt; (less than) symbol redirects data to the STDIN of the\nprogram. You might want to use this method to input data from a file\nto STDIN of a command that does not take a filename as an argument\nbut that does use STDIN. Although input sources can be redirected to\nSTDIN, such as a file that is used as input to grep, it is generally\nnot necessary as grep also takes a filename as an argument to specify\nthe input source. Most other commands also take a filename as an\nargument for their input source. \n<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Just grep\u2019ing around<\/h2>\n\n\n\n<p>The grep command is used to select lines that match a specified\npattern from a stream of data. grep is one of the most commonly used\ntransformer utilities and can be used in some very creative and\ninteresting ways. The grep command is one of the few that can\ncorrectly be called a filter because it does filter out all the lines\nof the data stream that you do not want; it leaves only the lines\nthat you do want in the remaining data stream.<\/p>\n\n\n\n<p>If the PWD is not the \/tmp\/test directory, make it so. Let\u2019s\nfirst create a stream of random data to store in a file. In this case\nwe want somewhat less random data that would be limited to printable\ncharacters. A good password generator program can do this. 
The\nfollowing program (you may have to install pwgen if it is not\nalready installed) creates a file that contains 50,000 passwords that are 80\ncharacters long using every printable character. Try it without\nredirecting to the random.txt file first to see what that looks like,\nand then run it once more, redirecting the output data stream to the file.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ <strong>pwgen -sy 80 50000 &gt; random.txt<\/strong><\/pre>\n\n\n\n<p>Considering that there are so many passwords, it is very likely\nthat some character strings in them are the same. First, cat the\nrandom.txt file, then use the grep command to locate some short,\nrandomly selected strings from the last ten passwords on the screen.\nI saw the word \u201csee\u201d in one of those ten passwords, so my command\nlooked like this: grep see random.txt. You can try that, but you\nshould also pick some strings of your own to check. Short strings of\n2 to 4 characters work best.<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">$ <strong>grep see random.txt <\/strong>\nR=p)'s\/~0}wr~2(OqaL.S7DNyxlmO69`\"12u]h@rp[D2%3}1b87+&gt;Vk,;4a0hX]d7<strong>see<\/strong>;1%9|wMp6Yl.\nbSM_mt_hPy|YZ1&lt;TY\/Hu5{g#mQ&lt;u_(@8B5Vt?w%i-&amp;C&gt;NU@[;zV2-<strong>see<\/strong>)&gt;(BSK~n5mmb9~h)yx{a&amp;$_e\ncjR1QWZwEgl48[3i-(^x9D=v)<strong>see<\/strong>YT2R#M:&gt;wDh?Tn$]HZU7}j!7bIiIr^cI.DI)W0D\"'vZU@.Kxd1E1\nz=tXcjVv^G\\nW`,y=bED]d|7%s6iYT^a^Bv<strong>see<\/strong>:v\\UmWT02|P|nq%A*;+Ng[$S%*s)-ls\"dUfo|0P5+n <\/pre>\n\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>It is the use of pipes and redirection that allows many of the\namazing and powerful tasks that can be performed with data streams on\nthe Linux command line. It is pipes that transport STDIO data streams\nfrom one program or file to another. The ability to pipe streams of\ndata through one or more transformer programs supports powerful and\nflexible manipulation of data in those streams. 
\n<\/p>\n\n\n\n<p>Each of the programs in the pipelines demonstrated in these experiments is small and each does one thing well. They are also transformers, that is they take Standard Input, process it in some way, and then send the result to Standard Output. Implementation of these programs as transformers to send processed data streams from their own Standard Output to the Standard Input of the other programs is complementary to and necessary for the implementation of pipes as a Linux tool.  <\/p>\n\n\n\n<p>STDIO is nothing more than streams of data. This data can be\nalmost anything from the output of a command to list the files in a\ndirectory, or an unending stream of data from a special device like\n\/dev\/urandom, or even a stream that contains all of the raw data from\na hard drive or a partition. \n<\/p>\n\n\n\n<p>Any device on a Linux computer can be treated like a data stream.\nYou can use ordinary tools like <strong>dd<\/strong> and <strong>cat<\/strong> to dump data\nfrom a device into a STDIO data stream that can be processed using\nother ordinary Linux tools. 
\n<\/p>\n","protected":false}}