Differences

This shows you the differences between two versions of the page.

--- unix_102 [2016/05/24 15:28]
peek [Wildcards]
+++ unix_102 [2016/06/07 13:57] (current)
peek
@@ Line 3: / Line 3: @@
 ====== Required Reading ======
-  - [[unix_102|Unix 102 -- Some basic Unix commands]]
+  - [[unix_101|Unix 101]]
+  - [[unix_commands|Unix Commands]]
-===== Modifying Program Behavior =====
+====== Dotfiles ======
-Many commands will run without any additional information from the user, but most will also allow you to modify their behavior by passing them new information.  Obviously one method of passing information from the user to a program is for the program to print a question to the screen and wait for the user to respond.  However, that's not a very efficient method of interaction if you're trying to automate your workflow.  A better method would be for the user to give the program access to everything it needs to know up front so that the computer doesn't have to stop and ask questions.  This also allows long programs or scripts to run overnight while the user goes home, which is very important for users who, for instance, want to process large amounts of data or run large numbers of simulations.
+Dotfiles deserve a mention: they are files or directories whose names begin with a '.'.  If you take a close look at the output of the ls command earlier in this document, you will notice that some invocations of ls listed these files and some did not.  In particular, the command ''ls'' by itself did not list dotfiles.  This is because dotfiles are "hidden files" under Unix.  They are shown if you pass special command line arguments to ls (such as ''-a''), but otherwise ls will not show them to you.  Dotfiles typically hold configuration information that the shell and other programs use to store your user-specific preferences and data.
-==== Options and Arguments ====
+Dotfiles are almost always plain text files.  This means that one way to change the behavior of a program is to edit the dotfiles that the program uses, and you can do this with any editor you choose.  A word of warning, however: incorrect content in a dotfile may result in unexpected behavior.  You should always make a backup of the file you are about to change so that you can restore the original version should something go terribly wrong.
-One method of passing extra information to a program is to add this information to the **command line** when you type your command into the shell.  The command line is the entire string of characters that you type into the shell, starting with the first character and ending when you press return.
+The easiest way to find out what dotfiles are used by a program is to check the program's manual, help, or info pages.
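As a minimal sketch of the points above, the following works in a throwaway directory, shows that plain ''ls'' skips a dotfile while ''ls -a'' lists it, and makes a backup before editing.  The file names ''visible.txt'' and ''.hidden_config'' and the helper ''mktemp'' are assumptions made for illustration; they do not come from this page.

```shell
# Work in a throwaway directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "${demo_dir}"

# Create one ordinary file and one dotfile (both names are hypothetical).
echo "ordinary"   > visible.txt
echo "color=blue" > .hidden_config

# Plain ls hides dotfiles; ls -a shows them (plus '.' and '..').
plain_listing=$(ls)
full_listing=$(ls -a)
echo "ls    : ${plain_listing}"
echo "ls -a :"
echo "${full_listing}"

# Back up the dotfile before changing it, then edit the original.
cp .hidden_config .hidden_config.bak
echo "color=green" > .hidden_config

# If the change goes wrong, restore the backup.
cp .hidden_config.bak .hidden_config
restored=$(cat .hidden_config)
echo "restored: ${restored}"
```

Keeping the backup next to the original (here with a ''.bak'' suffix) is only one convention; any copy you can find again later will do.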
-Command line options for a program can usually be found by typing the command, then a space, and then <code>-h</code> or <code>--help</code>.  Most programs will respond by giving you additional information about options and arguments that can be passed to the command on the command line.  Another way is to type <code>man <command></code> to see if the program comes with its own documentation.
+====== Parents, Children, Orphans, and Zombies ======
-Most command line options in Unix are prefixed by a single or double dash (''-'' or ''--'').  (Note: The difference between single and double dashes can be difficult to see on this web page.  I'll let you know if something is single or double.)
+It is often the case that a program will execute another program to either replace itself or to run alongside itself.  In fact, this is so common that it's the first thing that happens on a UNIX machine on bootup.  The very first program run by the system is usually called ''init'', and it takes care of starting up all of the other programs that make your system run.
-Examples of command line arguments:
+  * Every running program has an associated process identification number, called a PID.  You can see a list of all of the processes running on a computer with the ''ps -ef'' command.  The ''ps'' command shows you the **process table** -- a special table kept by the kernel to keep track of all of the processes currently running on the computer.  If you only want to see the processes running in your shell, just type ''ps'' without the ''-ef'' command line arguments.
+  * When a program executes another program, the executor is called the **parent**, and the executed program the **child**.  The parent process owns the child process, and the child process inherits its execution environment from its parent.  (Don't worry if you don't know what "execution environment" means.)
+  * The ''init'' program is the great ancestor -- every program has ''init'' somewhere in its lineage.  If a parent program dies, the child becomes an **orphan** and is adopted by ''init''.
+  * If a program exits, but it still has an entry in the process table, it's called a **zombie**.  This will have almost no effect on you or your UNIX experience.  Ever.  I just wanted a reason to bring zombies into the conversation.
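The parent/child relationship described above can be observed from a script.  This is a small sketch, not part of the original page: it relies on the shell variables ''$$'' (the shell's own PID) and ''$PPID'' (its parent's PID), and on ''sh -c'' to start a child shell, none of which have been introduced yet.

```shell
# $$ expands to the PID of the shell running this script.
shell_pid=$$

# Run a child shell that reports its own PID ($$) and its parent's PID ($PPID).
# Single quotes stop the current shell from expanding the variables itself.
report=$(mktemp)
sh -c 'echo "$$ $PPID"' > "${report}"
read -r child_pid child_ppid < "${report}"
rm -f "${report}"

echo "this shell's PID : ${shell_pid}"
echo "child's PID      : ${child_pid}"
echo "child's PPID     : ${child_ppid}"
```

When this is run directly from a shell, the child's PPID should match the first PID printed, since that shell is the parent.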
+====== Managing Running Commands ======
+When you run a command, that command takes over the terminal.  It expects to read input from the keyboard and write output to the screen.  In the meantime, your shell is sitting there waiting for the command to exit.  But how do you regain control if a program goes haywire?  Or what if your command is just taking a very long time to complete, and you want to get something else done while you wait?
+  * The behavior where a command takes over the keyboard and screen is called "running a command in the **foreground**".  This is the default behavior of commands unless told otherwise.
+  * If you want to interrupt a command, you can hold down the control key and press the "C" key (''control-c'').  This will normally kill the command.  (NOTE: If the command is held up by some kernel function, like performing I/O for instance, then the program cannot quit until the kernel returns control to it.)
+  * If you expect a program to take a long time, then you can run it as a **background** process.  Background processes should not expect keyboard input.  They may be fed input from a file (see the section on redirection and pipes below).  A process can be run in the background by appending a space and a ''&'' symbol to the end of the command line.
+  * If you are running a program in the foreground, but it's taking too long, and you want to switch it to the background, you can do so by holding down the control key and pressing "Z" (''control-Z'').  This will suspend the program without killing it.  You can then type the command ''bg'' to tell the shell to resume running the program in the background.  If you type control-Z and decide that you made a mistake, you can continue running the program in the foreground by typing ''fg''.
+  * If you have one or more programs running in the background, you can view a list of background processes by typing ''jobs''.
+For example, I'm going to use the ''sleep'' command to simulate a long-running process:
 <code>
-peek@catus:~/Documents/LinuxTutorial$ ls
+peek@catus:~$ # I can type something here and the shell will ignore it because
-Aware_-_Kontinuum.flac	log-messages.txt				README
+peek@catus:~$ # the line begins with a '#'.  In shell scripting, a line beginning
-Glaciers-SD.mp4		MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv
+peek@catus:~$ # with a '#' is called a comment line.
+peek@catus:~$ # This lets me tell you what I'm doing without confusing the shell
+peek@catus:~$ # First, I'm going to start a program that's going to take a long time:
+peek@catus:~$ sleep 300
+^Z
+[1]+  Stopped                 sleep 300
+peek@catus:~$ # I just pressed control-Z to interrupt the sleep command.
+peek@catus:~$ # The sleep command just sits and does nothing for the number of
+peek@catus:~$ # seconds I tell it to.
+peek@catus:~$ # In this case, the sleep command is going to sit there for 5 minutes
+peek@catus:~$ # (300 seconds). I didn't want to wait 5 minutes, so I'm going to move
+peek@catus:~$ # the command into the background with the 'bg' command
+peek@catus:~$ bg
+[1]+ sleep 300 &
+peek@catus:~$ # Now the sleep command is running again, but it's running in the
+peek@catus:~$ # background, which frees up the shell for me to run other commands.
+peek@catus:~$ # Now I'll run another one:
+peek@catus:~$ sleep 600
+^Z
+[2]+  Stopped                 sleep 600
+peek@catus:~$ # This command will run for 10 minutes (600 seconds).
+peek@catus:~$ # I can see a list of background processes with 'jobs'
+peek@catus:~$ jobs
+[1]-  Running                 sleep 300 &
+[2]+  Stopped                 sleep 600
+peek@catus:~$ # Notice that the 'sleep 300' command is "running", but the 'sleep 600'
+peek@catus:~$ # command is "stopped".  I'll start the 'sleep 600' command running again:
+peek@catus:~$ bg
+[2]+ sleep 600 &
+peek@catus:~$ # Now I can do other things.
+peek@catus:~$ # But what if I want to connect with one of my background processes?
+peek@catus:~$ # I can bring a background process to the foreground with the 'fg' command.
+peek@catus:~$ # Using 'fg' by itself will bring to the foreground the last command
+peek@catus:~$ # I put in the background.
+peek@catus:~$ # If I want to bring some other process to the foreground, then I have to
+peek@catus:~$ # use the numerical identifier displayed on the left in square brackets.
+[1]-  Done                    sleep 300
+peek@catus:~$ # Ah!  Now I see that the 'sleep 300' command has finished!
+peek@catus:~$ # Well, now there's no point in bringing it to the foreground -- it's
+peek@catus:~$ # done and gone.
+peek@catus:~$ # I'll create another 10-minute sleep...
+peek@catus:~$ sleep 600 &
+[3] 15157
+peek@catus:~$ # I started this command off in the background right away.  Notice that
+peek@catus:~$ # it was given a new, unique job number, 3, even though there are now only
+peek@catus:~$ # two processes running in the background.
+peek@catus:~$ jobs
+[2]-  Running                 sleep 600 &
+[3]+  Running                 sleep 600 &
+peek@catus:~$ # I'll bring the first process to the foreground.
+peek@catus:~$ fg 2
+sleep 600
+^Z
+[2]+  Stopped                 sleep 600
+peek@catus:~$ # There.  I did it.  But then I got bored again, so I put it back in
+peek@catus:~$ # the background.  When these processes end, they will tell you so with
+peek@catus:~$ # a "Done", just like the 'sleep 300' above.  However, even after the
+peek@catus:~$ # program exits, you won't see the "Done" line until you press RETURN
+peek@catus:~$ # or enter another command.
+peek@catus:~$ # Oh yeah, the job number is local to the shell.  It's not the same thing
+peek@catus:~$ # as the process identification number (PID).  The PID is something the
+peek@catus:~$ # kernel uses.  The difference is like "I'm in apartment B" versus
+peek@catus:~$ # "I live at 1234 Winston Way Apartments".  That is, if you view a shell
+peek@catus:~$ # as an apartment building, processes as residents, and the kernel as
+peek@catus:~$ # the city the apartment building is in.  But maybe that's just further
+peek@catus:~$ # confounding an already confusing concept...?
+[3]-  Done                    sleep 600
+peek@catus:~$ # Ah, well done.  I see that in the amount of time that it took for me
+peek@catus:~$ # to type, the last sleep command finished.
+peek@catus:~$ # Wait a second, wasn't there another sleep command that should have
+peek@catus:~$ # finished before that one?
+peek@catus:~$ jobs
+[2]+  Stopped                 sleep 600
+peek@catus:~$ bg
+[2]+ sleep 600 &
+peek@catus:~$ # Did you catch that mistake I made?  I said earlier that I had put job
+peek@catus:~$ # number 2 back into the background, but I forgot to actually type 'bg'.
+peek@catus:~$ # I've just wasted valuable time that I could have been using for
+peek@catus:~$ # something else.
+[2]+  Done                    sleep 600
+peek@catus:~$
 </code>
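The session above is interactive.  Inside a script there is no keyboard for control-Z, so a rough equivalent is to start commands with ''&'' and collect them later.  The sketch below assumes two things this page has not covered: the variable ''$!'' (PID of the most recent background command) and the ''wait'' builtin.

```shell
# Start two "long-running" commands in the background.
sleep 2 &
first_pid=$!          # $! holds the PID of the most recent background command

sleep 1 &
second_pid=$!

# The shell is free to do other work while both sleeps run.
echo "started background PIDs ${first_pid} and ${second_pid}"

# wait blocks until the given process exits and returns its exit code.
wait "${first_pid}"
first_status=$?
wait "${second_pid}"
second_status=$?
echo "exit codes: ${first_status} ${second_status}"
```

Note that these are PIDs, not the bracketed job numbers shown by ''jobs''; the distinction is the same one made in the transcript.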
-In the above example, I type the command <code>ls</code> without any arguments.  The output is ls' default behavior.  But I can modify what ls does:
+====== Input, Output, and Redirection ======
+In computer programming, standard streams are input and output communication channels between a computer program and its environment.  These streams are preconnected when the program begins its execution.  There are three standard I/O channels that are available to every program: **standard input** (**stdin**), **standard output** (**stdout**), and **standard error** (**stderr**).
+Unless told otherwise, the operating system will assume that a program's stdin comes from the keyboard, and a program's stdout and stderr will go to the screen.  However, it's often useful to redirect input and output, or to connect the output of one program to the input of another.  This is called **redirection**.
+^ Command Line Argument ^ Redirection Type ^
+^ <code>></code> | Standard output redirection -- Output from the program is sent to the given file, pipe, or specified destination. <code>
+$ ls -ald /etc/passwd > /tmp/ls-output.txt
+$ cat /tmp/ls-output.txt
+-rw-r--r-- 1 root root 2803 May 10 16:40 /etc/passwd
+</code> |
+^ <code><</code> | Standard input redirection -- Input to the program is read from the given file, pipe, or specified source. <code>
+$ cat < /etc/passwd
+root:x:0:0:root:/root:/bin/bash
+daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
+bin:x:2:2:bin:/bin:/usr/sbin/nologin
+sys:x:3:3:sys:/dev:/usr/sbin/nologin
+[**> The rest of this output removed for brevity <**]
+</code> |
+^ <code>2></code> | Standard error redirection -- Output written to standard error is instead written to the given file, pipe, or specified destination. <code>
+$ ls -ald /this-file-does-not-exist 2> /tmp/ls-output.txt
+$ cat /tmp/ls-output.txt
+ls: cannot access '/this-file-does-not-exist': No such file or directory
+</code> |
+^ <code>|</code> | Pipe -- This is used to shuttle output from one command to another command's input. <code>
+$ cat /etc/passwd | wc -c
+2803
+$ cat /etc/passwd | wc -l
+51
+$ cat /etc/passwd | wc --max-line-length
+87
+</code> (This shows that my ''/etc/passwd'' file contains 2,803 bytes and 51 lines.  The longest line is 87 characters.  These command line arguments and more can be found in the ''wc'' man page, or by typing ''wc --help'' (two dashes before "help").) |
+^ <code>2>&1</code> | Standard error redirection -- Stderr is written to wherever stdout goes.  For example, to write both streams to one file: <code><command> > logfile.txt 2>&1</code> The order matters: ''2>&1'' must come after the stdout redirection.  Do not write <code><command> > logfile.txt 2> logfile.txt</code> instead, because that opens the file twice and the two streams can overwrite each other's output. |
+^ <code>>&2</code> | Standard output redirection -- Stdout is written to wherever stderr goes. |
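The operators in the table can be tried together in one short script.  This is only a sketch: the function name ''say_both'' is made up for illustration, and ''mktemp'' and ''tr'' are helpers not introduced on this page.

```shell
out_file=$(mktemp)
err_file=$(mktemp)
both_file=$(mktemp)

# A small helper (hypothetical name) that writes one line to each stream.
# '>&2' sends echo's stdout to wherever stderr currently points.
say_both() {
    echo "this is stdout"
    echo "this is stderr" >&2
}

# '>' and '2>': send the two streams to separate files.
say_both > "${out_file}" 2> "${err_file}"

# '2>&1': send both streams to one file (stdout redirection comes first).
say_both > "${both_file}" 2>&1

# '<': feed a file to a command's stdin.
line_count=$(wc -l < "${both_file}")

# '|': connect one command's stdout to the next command's stdin.
upper=$(cat "${out_file}" | tr 'a-z' 'A-Z')

out_text=$(cat "${out_file}")
err_text=$(cat "${err_file}")
echo "stdout file            : ${out_text}"
echo "stderr file            : ${err_text}"
echo "lines in combined file : ${line_count}"
echo "piped through tr       : ${upper}"
rm -f "${out_file}" "${err_file}" "${both_file}"
```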
+====== Variables and Environment Variables ======
+A variable is simply a mapping between a string name and a value.  In the shell, values can be strings or integers.  (Fractions and decimal values are treated like strings.)  Variables are created by naming the variable, followed immediately by an equal sign (no spaces), and the value.  If the value is to be a string with spaces, then the value needs to be wrapped in single or double quotes.
+For example:
 <code>
-peek@catus:~/Documents/LinuxTutorial$ ls -1
+peek@catus:~/Documents/Software$ echo "${v}"
-Aware_-_Kontinuum.flac
-Glaciers-SD.mp4
+peek@catus:~/Documents/Software$ v="Hello World"
-log-messages.txt
+peek@catus:~/Documents/Software$ echo "${v}"
-MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv
+Hello World
-README
 </code>
-This time I pass ls ''-1'', telling it that I want it to print everything in only one column.  This is especially handy for piping the output of ls to another command or script that will do something useful.
+**NOTE:**
+  * Here I'm introducing a new command, ''echo''.  This command will print out whatever you give it as an argument.  In addition to printing out the value of a variable, it's also very useful to use inside of scripts for giving the user feedback about what the script is doing.
+  * Also notice that while I assign the variable as ''<variablename>=<value>'', I must access the variable by prepending a dollar sign to its name and using curly braces.  Actually, the curly braces are optional, but if you do any scripting for very long, then you'll find that using curly braces keeps things clean and bug-free.  So by introducing variable referencing to you with curly braces, I'm hoping you will avoid potential heartache down the road.
+In the first command I use the ''echo'' command to try to print out the value of the ''v'' variable.  The shell has no ''v'' variable defined, so the shell prints out an empty line.  In the second command I assign a value to the variable ''v''.  And in the third command I print out the value of ''v'' again -- this time it works.
+Environment variables are just variables that the shell shares with any program that it executes.  To turn a variable into an environment variable, you only need to export it:
 <code>
-peek@catus:~/Documents/LinuxTutorial$ ls -al
+export v
-total 35620
-drwxrwx---  2 peek peek     4096 May 13 13:33 .
-drwxr-x--- 47 peek peek     4096 May 13 12:03 ..
--rw-rw----  1 peek peek 30725004 May 12 15:32 Aware_-_Kontinuum.flac
--rw-rw----  1 peek peek  2358156 May 12 15:29 Glaciers-SD.mp4
--rw-r-----  1 peek peek  3359600 May 12 16:02 log-messages.txt
--rw-rw----  1 peek peek     3188 May 12 16:12 MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv
--rw-rw----  1 peek peek      579 May 12 15:41 README
-peek@catus:~/Documents/LinuxTutorial$
 </code>
-The above shows an argument that will tell ls to show me a lot of extra information about the files in this directory.  Specifically, column one shows me permissions, columns 3 and 4 show me user and group ownership, column 5 shows me file size in bytes, and columns 6-8 shows me the date and time that the file or directory was last modified.
+Now, any program that is executed by the shell will be able to see and use the ''v'' variable.
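A quick way to see the difference export makes is to ask a child program for the variable before and after.  In this sketch the child is another shell started with ''sh -c'', which is an assumption of the example rather than something this page has covered.

```shell
# Start from a clean slate in case v is already set or exported.
unset v

# A plain shell variable: visible in this shell, but not passed to children.
v="Hello World"
before_export=$(sh -c 'echo "${v}"')

# After export, programs started by this shell inherit the variable.
export v
after_export=$(sh -c 'echo "${v}"')

echo "before export: '${before_export}'"
echo "after export : '${after_export}'"
```

The single quotes around the ''sh -c'' argument matter: they make the child shell, not the current one, look up ''v''.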
-==== Configuration Files ====
+If you haven't guessed already, the system has a set of standard environment variables that are defined automatically.  Here's a list of the most common environment variables:
-Another method of passing information along to a program is by creating or editing a configuration file.  Configuration files are specific to the program that reads them, so specifics won't be given here.  Just being aware of their existence can make your life easier.
+^ Variable Name ^ Description ^
+^ HOME | The location of the user's home directory. <code>peek@catus:~$ echo "${HOME}"
+/home/peek
+</code> |
+^ PATH | A colon-separated list of directories in which to look for command programs.  For every command you type, the shell will search each of these directories in turn until it finds the command you want.  The first match is used, which means the order in which these directories appear is important.  Here's my ''PATH'' (which will differ from yours): <code>peek@catus:~$ echo "${PATH}"
+/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/loc
+al/games:/snap/bin:/home/peek/usr/Linux/x86_64/bin:/home/peek/usr/Linux/bin:/hom
+e/peek/usr/bin
+</code> |
+^ USER | Your user name. <code>
+peek@catus:~$ echo "${USER}"
+peek
+</code> |
-==== Environment Variables ====
+There are usually many, many more.  Some are standard and used on nearly every Unix implementation that exists (like ''HOME'', ''USER'', and ''PATH''); others may be non-standard and exist only on a particular machine.
-A third common method of passing information to a program is through environment variables.  These are special variables that are given to the shell and the shell will remember them and pass their values along to any scripts or programs that the shell executes.
+If you want to see a list of all of the variables set in your shell, type the ''set'' command; to see only the environment variables, type ''env''.  (Pro Tip: Pipe the output to something like ''less'' so that you can actually read it before it scrolls off the top of the screen.)
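Because ''PATH'' is searched in order, it can help to print it one directory per line.  The sketch below uses ''tr'' to split on colons and ''command -v'' to ask the shell where it finds a command; both are assumptions of this example and are not introduced elsewhere on the page.

```shell
# Print each PATH directory on its own line, in the order the shell searches.
path_dirs=$(echo "${PATH}" | tr ':' '\n')
echo "${path_dirs}"

# Count the directories, and ask the shell where it finds 'ls'.
dir_count=$(echo "${path_dirs}" | wc -l)
ls_location=$(command -v ls)
echo "PATH has ${dir_count} entries; ls resolves to ${ls_location}"
```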
+====== Combining Commands and Subshells ======
-===== Some Useful Commands =====
+===== Combining Commands On A Single Line =====
-^ **Command** ^ **Description** ^
+Commands can be combined on the same line by separating each command with a semicolon, like so:
-^ <code>pwd</code> | Returns the shell's current working directory. <code>$ pwd
-/home/peek</code> |
-^ <code>cd</code> | Used by itself, the ''cd'' command will return the user's current working directory to the top of their home area (usually ''/home/<username>'').  When used with a command line argument, the ''cd'' command will attempt to change the current working directory to the directory name passed to it as an argument. <code>$ pwd
-/home/peek/usr/bin
-$ cd
-$ pwd
-/home/peek
-$ cd /tmp
-$ pwd
-/tmp</code> |
-^ <code>ls</code> | Used by itself, the ''ls'' command will list the directory contents of the current working directory.  The output format can be modified with command line arguments -- and, optionally, the user may give a list of files or directories to list. <code>$ ls
-Aware_-_Kontinuum.flac	log-messages.txt				README
-Glaciers-SD.mp4		MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv
-$ ls -al
-total 35620
-drwxrwx---  2 peek peek     4096 May 13 13:33 .
-drwxr-x--- 47 peek peek     4096 May 13 12:03 ..
--rw-rw----  1 peek peek 30725004 May 12 15:32 Aware_-_Kontinuum.flac
--rw-rw----  1 peek peek  2358156 May 12 15:29 Glaciers-SD.mp4
--rw-r-----  1 peek peek  3359600 May 12 16:02 log-messages.txt
--rw-rw----  1 peek peek     3188 May 12 16:12 MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv
--rw-rw----  1 peek peek      579 May 12 15:41 README
-$ ls /tmp
-aws_root.log
-dvdcss-DnlBhJ
-dvdcss-q9Iio2
-hsperfdata_root
-ssh-8eL6NRDqQg
-ssh-gUTLY6GtJG
-systemd-private-6289425d307e43a98f4f52405e0d97ac-colord.service-IaXZv6
-systemd-private-6289425d307e43a98f4f52405e0d97ac-tor@default.service-CIBSIM
-$ ls /tmp /var/tmp
-/tmp:
-aws_root.log
-dvdcss-DnlBhJ
-dvdcss-q9Iio2
-hsperfdata_root
-ssh-8eL6NRDqQg
-ssh-gUTLY6GtJG
-systemd-private-6289425d307e43a98f4f52405e0d97ac-colord.service-IaXZv6
-systemd-private-6289425d307e43a98f4f52405e0d97ac-tor@default.service-CIBSIM
-/var/tmp:
+^ This ^ = ^ This ^
-systemd-private-6008fcbada2e41ed9bddfe6b012eb599-colord.service-c0jtLz
+| <code>ls -1
-systemd-private-6008fcbada2e41ed9bddfe6b012eb599-tor@default.service-B7IV7c
+cd ~/Desktop
-systemd-private-6289425d307e43a98f4f52405e0d97ac-colord.service-iHELip
+df .
-systemd-private-6289425d307e43a98f4f52405e0d97ac-tor@default.service-J0PeqU
+</code> | | <code>ls -1 ; cd ~/Desktop ; df .</code> |
+===== Splitting (A) Command(s) Across Multiple Lines =====
+Just like it's possible to combine commands, it's also possible to split commands.  Any line that ends with a ''\'' character is taken by the shell to mean that the command is incomplete, and that there will be more coming on the next line.  Usually you wouldn't do this for commands that you type yourself, but it's handy to use when writing shell scripts as it makes your script easier to read and understand.  For an example, see the command substitution section below.
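Both ideas fit in a few lines.  This is a minimal sketch with made-up variable names: three assignments share one line thanks to semicolons, and one ''echo'' command is written first on a single line and then split with trailing backslashes.

```shell
# Three commands on one line, separated by semicolons.
first="one" ; second="two" ; third="three"

# One command on a single line...
one_line=$(echo "${first}" "${second}" "${third}")

# ...and the same command split across lines with trailing backslashes.
split_line=$(echo "${first}" \
    "${second}" \
    "${third}")

echo "one line : ${one_line}"
echo "split    : ${split_line}"
```

The backslash must be the very last character on the line; a space after it would break the continuation.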
+===== Executing Commands In A Subshell =====
+Commands can also be run in a subshell.  This means that the shell runs a copy of itself, and the copy executes your command.  Why would you want to do this?  Well, here's an example.  Say you want to time how long it takes to log into a set of remote machines with ssh and run a command:
+<code>peek@catus:~$ time ssh peek@alces01 "uptime" ; time ssh peek@alces02 "uptime" ; time ssh peek@alces03 "uptime"
+:09:03 up 4 days, 18:11,  0 users,  load average: 0.06, 0.07, 0.06
+real	0m0.591s
+user	0m0.020s
+sys	0m0.000s
+:09:04 up 4 days, 18:31,  0 users,  load average: 0.01, 0.02, 0.05
+real	0m0.667s
+user	0m0.020s
+sys	0m0.000s
+:09:04 up 4 days, 19:19,  0 users,  load average: 0.00, 0.01, 0.05
+real	0m0.673s
+user	0m0.020s
+sys	0m0.000s
+</code>
+That's all nice and fine, but if you want to know the total time to execute all three commands then you have to do some math.  Another method would be to run the three commands in a subshell, and then time the subshell:
+<code>
+peek@catus:~$ time (ssh peek@alces01 "uptime" ; ssh peek@alces02 "uptime" ; ssh peek@alces03 "uptime")
+:10:27 up 4 days, 18:12,  0 users,  load average: 0.12, 0.08, 0.06
+:10:27 up 4 days, 18:32,  0 users,  load average: 0.00, 0.01, 0.05
+:10:28 up 4 days, 19:20,  0 users,  load average: 0.04, 0.04, 0.05
+real	0m1.788s
+user	0m0.056s
+sys	0m0.004s
+</code>
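Timing is one use of a subshell.  Another consequence of running in a copy of the shell, sketched below with made-up variable names, is that changes made inside the parentheses (the working directory, variable values) do not leak back into the shell that started it.

```shell
start_dir=$(pwd)
marker="outer"

# Parentheses run the enclosed commands in a subshell.
# The cd and the assignment happen only inside that copy of the shell.
(
    cd /
    marker="inner"
    echo "inside subshell : $(pwd), marker=${marker}"
)

end_dir=$(pwd)
echo "after subshell  : ${end_dir}, marker=${marker}"
```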
+===== Command Substitution =====
+Command substitution allows the output of a command to replace the command name.  There are two forms:
+^ Old Form ^ New Form ^
+| <code>`<commands>`</code> | <code>$(<commands>)</code> |
+Why would you want to do this?  Earlier you saw how to send the output of one command to the input of another command with pipes.  But what if what you need is to take the output of one command and use it as a command line argument of another command?
+For example:
+<code>
+<command1> $(<command2> $(<command3>) )
+</code>
+This may not seem like much right now, but it becomes very powerful when you get into shell scripting.  Here's an example.  **NOTE: Don't worry if you don't understand what the code does!**  It might look intimidating for the uninitiated -- especially if this is your first trip into terminal-land.  It's just hard to relay how useful some of the shell's functions are without going deeper.  For now, just bask in it as a glorious example, and for those that want to know more, I'll go into details below:
+<code>
+peek@catus:~$ list_of_user_shells=$(for uid in $(seq 108 110); do grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; done | awk -F: '{print $7}' | sort | uniq)
+peek@catus:~$ echo "${list_of_user_shells}"
+/bin/false
+/usr/sbin/nologin
+</code>
+This line is long and ugly to look at.  I can break it up:
+<code>
+peek@catus:~$ list_of_user_shells=$(\
+> for uid in $(seq 108 110); do \
+>   grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
+> done \
+> | awk -F: '{print $7}' \
+> | sort \
+> | uniq \
+> )
+peek@catus:~$ echo "${list_of_user_shells}"
+/bin/false
+/usr/sbin/nologin
+</code>
+**NOTE:** The line breaks are to make the code more readable.  The ''>'' prompt is a sub-prompt printed by the shell, telling me that the shell understands that my ''\'' character on the end of my input denotes that I'm not done entering my command.  You wouldn't actually type the ''>'' character yourself.
+What does this command do?  It searches ''/etc/passwd'' for any user with a user ID number between 108 and 110 inclusive, then pulls the login shell from each user record, puts the login shells into a list, sorts the list, and then removes duplicate entries.  Here's a breakdown:
+^ Command ^ Description ^
+| <code>seq 108 110</code> | This command prints out all integers between the two integers given as its command line arguments, inclusive.  Ex: <code>$ seq 108 110
+108
+109
+110
+</code> |
+| <code>for uid in $(seq 108 110); do \
+   ... ; \
+done</code> | This command reads in the integers output by the ''seq'' command and loops over each one, assigning each number to the variable ''uid'' and then executing the commands between ''do'' and ''done''.  Ex: <code>$ for uid in $(seq 108 110); do \
+  echo "PROCESSING UID: ${uid}" ; \
+done
+PROCESSING UID: 108
+PROCESSING UID: 109
+PROCESSING UID: 110
+</code> |
+| <code>for uid in $(seq 108 110); do \
+  grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
+done</code> | This will run a ''grep'' command for every integer value of ''${uid}'' from 108 to 110.  The ''grep'' command will pull out the user record for the user whose UID matches the value stored in ''${uid}''. Ex: <code>
+$ for uid in $(seq 108 110); do \
+>   grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
+> done
+sshd:x:108:65534::/var/run/sshd:/usr/sbin/nologin
+colord:x:109:116:colord colour management daemon,,,:
+/var/lib/colord:/bin/false
+statd:x:110:65534::/var/lib/nfs:/bin/false
+</code> (Note: Line wrapped for readability) |
+| <code>for uid in $(seq 108 110); do \
+  grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
+done \
+| awk -F: '{print $7}'
+</code> | We only want to extract the shell from the user record.  Here's where understanding the user record will come in handy.  The format of ''/etc/passwd'' is such that each line is a separate record, and each field of the record is separated by a colon.  The user's shell is stored in the 7th field of the record.  The ''awk'' command here tells awk that the field separator is a colon, and that we want to print out field number 7. Ex: <code>$ for uid in $(seq 108 110); do \
+>   grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
+> done \
+> | awk -F: '{print $7}'
+/usr/sbin/nologin
+/bin/false
+/bin/false
 </code> |
-^ <code>file</code> | Tells you what type of data is in a file. <code>$ ls -1
+| <code>for uid in $(seq 108 110); do \
-Aware_-_Kontinuum.flac
+  grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
-Glaciers-SD.mp4
+done \
-log-messages.txt
+| awk -F: '{print $7}' \
-MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv
+| sort \
-README
+| uniq
-$ file README
+</code> | In building our list of shells, we don't want duplicate entries.  There are two entries for ''/bin/false''.  We can use ''sort'' and ''uniq'' to get rid of these extra entries. |
-README: ASCII text
+| <code>list_of_user_shells=$(\
-$ file --mime-type *
+  for uid in $(seq 108 110); do \
-Aware_-_Kontinuum.flac:                         audio/x-flac
+    grep "^[^:]*:[^:]*:${uid}:" /etc/passwd ; \
-Glaciers-SD.mp4:                                video/mp4
+  done \
-log-messages.txt:                               text/plain
+  | awk -F: '{print $7}' \
-MendelMax_3_Full_Kit_Packing_Slip_-_Sheet1.csv: text/plain
+  | sort \
-README:                                         text/plain</code> |
+  | uniq \
-^ <code>less</code> | Lets you view a file's contents.  If the file's contents are longer than a single page on your terminal, then less will pause and wait for you to press the space bar before it continues with the next page.  Less will also allow you to perform searches <code>$ less README
+  )</code> | Finally, this last bit wraps the entire command into a sub-shell command substitution.  The shell will take the output from the entire command and place it into the variable ''list_of_user_shells'', which we can use later. |
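For readers who want something smaller to experiment with, here is a self-contained sketch of command substitution that does not depend on the contents of ''/etc/passwd''.  The scratch directory and file names are made up for illustration, and ''mktemp'' and ''touch'' are helpers not introduced on this page.

```shell
# Build a scratch directory holding three empty files.
scratch=$(mktemp -d)
touch "${scratch}/a.txt" "${scratch}/b.txt" "${scratch}/c.txt"

# New form: the output of the pipeline becomes the variable's value.
file_count=$(ls "${scratch}" | wc -l)

# Old form: backticks do the same job but are harder to nest.
file_count_old=`ls "${scratch}" | wc -l`

# Nested: the inner substitution supplies seq's second argument.
numbers=$(seq 1 $(ls "${scratch}" | wc -l))

echo "files   : ${file_count}"
echo "numbers : $(echo ${numbers})"   # unquoted, so newlines become spaces
rm -r "${scratch}"
```

The nested line is the reason the new form is usually preferred: the same thing with backticks would need escaped inner backticks.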
-This is a standard README file.  It's a plain text file that tells you
+====== Exit Codes ======
-something about the contents of the other files and directories located here.
-Readme files aren't always available but it's nice when they are.
-Attribution:
+Whenever a program exits it returns an exit code to the shell.  An exit code of 0 means that the program exited normally.  A non-zero exit code means that an error occurred.  This is useful information for building conditional commands that may change behavior depending on what errors arise.  For instance, the ''make'' command will execute a list of commands in a file named ''Makefile'', and exit the first time it encounters an error.  Makefiles are often used to generate programs and content.  But for now, it's sufficient for you to know that exit codes exist and that they are useful.
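A short sketch of how exit codes are typically inspected and used.  It assumes several things this page has not introduced: the variable ''$?'' (exit code of the last command), the commands ''true'' and ''false'', the ''&&'' and ''||'' operators, and ''grep -q''.

```shell
# $? holds the exit code of the most recent command.
true
status_of_true=$?

# 'false' always fails; capture its exit code on the same line.
false || status_of_false=$?

# && runs the next command only on success, || only on failure.
scratch=$(mktemp)
echo "needle" > "${scratch}"
grep -q "needle" "${scratch}" && found="yes"
grep -q "absent" "${scratch}" || missing="yes"

# 'if' tests a command's exit code directly.
if grep -q "needle" "${scratch}"; then
    verdict="match"
else
    verdict="no match"
fi

echo "true -> ${status_of_true}, false -> ${status_of_false}"
echo "found=${found} missing=${missing} verdict=${verdict}"
rm -f "${scratch}"
```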
-* Glaciers-SD.mp4
-        Science World: How do you protect glaciers?
-        Released by: Polyester Studio
-        Date: April 28, 2016 (approx.)
-  URL: https://vimeo.com/164133990
-* Aware_-_Kontinuum.flac
+====== Playing Around ======
-  Kontinuum - Aware [NCS Release]
-        Released by: NoCopyrightSounds
-        Date: May 27, 2015
-        URL: https://soundcloud.com/nocopyrightsounds/kontinuum-aware-ncs-release
-~
+^ Make a safe place to play around ||
-~
+^ Type: | <code>$ mkdir /tmp/playground
-~
+$ cd /tmp/playground
-~
-README (END)
 </code> |
-^ <code>mkdir</code> <code>rmdir</code> | Create or remove a directory. <code>$ ls -ald ERASEME
+^ Get a text file to play around with ||
-ls: cannot access 'ERASEME': No such file or directory
+^ Type: | <code>$ wget -O file.txt 'http://ocw.mit.edu/ans7870/6/6.006/s08/lecturenotes/files/t8.shakespeare.txt'
-$ mkdir ERASEME
+--2016-06-07 09:25:57--  http://ocw.mit.edu/ans7870/6/6.006/s08/lecturenotes/files/t8.shakespea
-$ ls -ald ERASEME
+re.txt
-drwxrwx--- 2 peek peek 4096 May 16 14:04 ERASEME
+Resolving ocw.mit.edu (ocw.mit.edu)... 23.15.135.8, 23.15.135.19
-$ rmdir ERASEME
+Connecting to ocw.mit.edu (ocw.mit.edu)|23.15.135.8|:80... connected.
-$ ls -ald ERASEME
+HTTP request sent, awaiting response... 200 OK
-ls: cannot access 'ERASEME': No such file or directory</code> |
+Length: 5458199 (5.2M) [text/plain]
-^ <code>cp</code> | Copy a file to a new filename, or to a new directory. <code>
+Saving to: ‘file.txt’
-$ mkdir ERASEME
-$ ls -ald ERASEME/*
+100%[======================================>] 5,458,199   1.58MB/s   in 3.3s
-ls: cannot access 'ERASEME/*': No such file or directory
-$ cp README ERASEME/
+2016-06-07 09:26:00 (1.58 MB/s) - ‘file.txt’ saved [5458199/5458199]
-$ ls -ald ERASEME/*
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/README
-$ cp README ERASEME/mullets-are-a-way-of-life
-$ ls -ald ERASEME/*
--rw-rw---- 1 peek peek 579 May 16 14:16 ERASEME/mullets-are-a-way-of-life
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/README
 </code> |
-^ <code>rm</code> | Remove a file (**NOTE: THERE IS NO GOING BACK!**) <code>
+^ How many lines are in the file? ||
-$ ls -ald ERASEME/*
+^ Type: | <code>$ wc -l file.txt
--rw-rw---- 1 peek peek 579 May 16 14:16 ERASEME/mullets-are-a-way-of-life
+file.txt
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/README
-$ rm ERASEME/mullets-are-a-way-of-life
-$ ls -ald ERASEME/*
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/README
 </code> |
-^ <code>mv</code> | Move or rename a file. <code>
+^ How many words are in the file? ||
-$ ls -ald ERASEME/*
+^ Type: | <code>$ wc -w file.txt
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/README
+file.txt
-$ mv ERASEME/README ERASEME/duck-duck-goose
-$ ls -ald ERASEME/*
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/duck-duck-goose
-$ mkdir ERASEME/subdirectory
-$ ls -ald ERASEME/*
--rw-rw---- 1 peek peek  579 May 16 14:15 ERASEME/duck-duck-goose
-drwxrwx--- 2 peek peek 4096 May 16 14:24 ERASEME/subdirectory
-$ mv ERASEME/duck-duck-goose ERASEME/subdirectory
-$ ls -ald ERASEME/*
-drwxrwx--- 2 peek peek 4096 May 16 14:25 ERASEME/subdirectory
-$ ls -ald ERASEME/subdirectory/*
--rw-rw---- 1 peek peek 579 May 16 14:15 ERASEME/subdirectory/duck-duck-goose
 </code> |
-^ <code>find</code> | Locates a file or directory somewhere inside either the current directory or a subdirectory.  Find is a **very** powerful tool, and I won't cover its use in depth here, but I introduce it so that you know what I'm talking about when I use it later.  When run without any command line arguments, find will print out a recursive listing of all files and directories located in the current directory and below.  That's good enough for now. |
+^ What are the first 10 lines of this file? ||
-^ <code>cat</code> | Copies input to output. |
+^ Type: | <code>$ head -10 file.txt
-^ <code>wc</code> | Counts characters, words, and lines. |
+This is the 100th Etext file presented by Project Gutenberg, and
-^ <code>head -<n> </code> | Prints the first ''<n>'' lines of input, then ignores the rest. |
+is presented in cooperation with World Library, Inc., from their
-^ <code>tail -<n> </code> | Prints the last ''<n>'' lines of input, then ignores the rest. |
+Library of the Future and Shakespeare CDROMS.  Project Gutenberg
+often releases Etexts that are NOT placed in the Public Domain!!
-===== Playing Around =====
+Shakespeare
-Now that you have a few commands to play with, let's play with them:
+*This Etext has certain copyright implications you should read!*
-^ Type: | <code>$ mkdir /tmp/playground</code> |
+<<THIS ELECTRONIC VERSION OF THE COMPLETE WORKS OF WILLIAM
-^ Type: | <code>$ find /tmp/playground
-/tmp/playground</code> |
-^ Type: | <code>$ cd /tmp/playground</code> |
-^ Type: | <code>pwd
-/tmp/playground</code> |
-^ Type: | <code>$ mkdir dir1 dir2</code> |
-^ Type: | <code>$ cp /etc/passwd .</code> |
-^ Type: | <code>$ find
-.
-./passwd
-./dir1
-./dir2
 </code> |
-^ Type: | <code>mv passwd fun</code> |
+^ What is the 3rd word on each line of the last ten lines? ||
-^ Type: | <code>$ find
+^ Type: | <code>$ cat file.txt \
-.
+> | awk '{print $3}' \
-./dir1
+> | tail -10
-./dir2
+ONLY,
-./fun
+COMMERCIAL
+CHARGES
+this
 </code> |
-^ Type: | <code>mv fun dir1</code> |
+^ What are the top 10 most frequently used words? ||
-^ Type: | <code>$ find
+^ Type: | <code>$ cat file.txt \
-.
+> | awk '{a[$1]++}END{for(k in a)print a[k],k}' RS=" |\n" \
-./dir1
+> | sort -nr \
-./dir1/fun
+> | head -10
-./dir2
+the
+I
+and
+to
+of
+a
+my
+in
+you
 </code> |
-^ Type: | <code>mv dir1 dir2</code> |
+^ NOTE: How would you know to do that!?!?  The easiest way is to search online for someone who's already done it, and then copy what they typed.  That's what I did.  There are also several online forums for command line usage.  Awk is so powerful that I've only scratched the surface of it myself. ||
-^ Type: | <code>$ find
+^ In the file '/etc/passwd', what is the 8th line? ||
-.
+^ Type: | <code>$ cat /etc/passwd | head -8 | tail -1
-./dir2
+lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
-./dir2/dir1
-./dir2/dir1/fun
 </code> |
-^ Type: | <code>cd</code> |
+^ The lines in /etc/passwd are fields separated by a colon.  What is the value in the 5th field? ||
-^ Type: | <code>rmdir /tmp/playground
+^ Type: | <code>$ cat /etc/passwd | head -8 | tail -1 | awk -F: '{print $5}'
-rmdir: failed to remove '/tmp/playground': Directory not empty
+lp
 </code> |
-^ Type: | <code>rm /tmp/playground/dir2/dir1/fun</code> |
+^ The 3rd field is the User ID number (UID).  What is the sum of all UIDs in '/etc/passwd'? ||
-^ Type: | <code>find /tmp/playground
+^ Type: | <code>$ n=0
-/tmp/playground
+$ cat /etc/passwd \
-/tmp/playground/dir2
+> | awk -F: '{print $3}' \
-/tmp/playground/dir2/dir1
+> | while read d ; do let n=$(( $n + $d )) ; done
+$ echo $n
 </code> |
-^ Type: | <code>rmdir /tmp/playground/dir2/dir1</code> |
+^ | NOTE: That didn't work!  Why?  Because the while loop is the last stage of a pipeline, and bash runs it in a subshell.  A subshell gets its own copy of the parent shell's variables, not the originals.  This means that when the subshell sums up values for 'n', that total is lost when the subshell exits.  Since the parent's copy of 'n' never changes, its value is still zero. |
-^ Type: | <code>rmdir /tmp/playground/dir2</code> |
+^ So what's the correct way to do it?  Here's one way that works: ||
-^ Type: | <code>rmdir /tmp/playground</code> |
+^ Type: | <code>$ n=0
+$ for d in $(cat /etc/passwd | awk -F: '{print $3}') ; do \
+> n=$(( $n + $d )) ; \
+> done
+$ echo $n
+
+</code> |
+^ | NOTE: The for loop doesn't execute in a subshell.  How would you know this?  Well, reading the bash manual is probably the best way.  :-/ |
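Two other ways to keep the running total out of a subshell can be sketched as follows.  This is only an illustration under stated assumptions: the ''sample'' text is made-up data in ''/etc/passwd'' format so that the result is predictable, and the variable names (''sample'', ''awk_sum'', ''name'', ''pass'', ''uid'', ''rest'') are hypothetical.

```shell
# Hypothetical sample data in /etc/passwd format (the 3rd field is the UID).
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin'

# Way 1: let awk keep the running total itself, so no shell loop is needed.
awk_sum=$(printf '%s\n' "$sample" | awk -F: '{ total += $3 } END { print total }')

# Way 2: feed the loop through a redirection (a here-document) instead of a
# pipe, so the while loop runs in the current shell and 'n' keeps its value.
n=0
while IFS=: read -r name pass uid rest ; do
    n=$(( n + uid ))
done <<EOF
$sample
EOF

echo "awk: $awk_sum  loop: $n"
```

To try the second form on the real file, the here-document can be swapped for a plain input redirection: ''done < /etc/passwd''.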
