30 July, 2020
This article was originally published in DigitalOcean’s public knowledge base. It has been reproduced here with some minor edits.
Python 3 includes the subprocess module for running external programs and reading their outputs in your Python code.
You might find subprocess useful if you want to use another program on your computer from within your Python code. For example, you might want to invoke git from within your Python code to retrieve files in your project that are tracked in git version control. Since any program you can access on your computer can be controlled by subprocess, the examples shown here will be applicable to any external program you might want to invoke from your Python code.
subprocess includes several classes and functions, but in this tutorial we’ll cover one of subprocess’s most useful functions: subprocess.run. We’ll review its different uses and main keyword arguments.
To get the most out of this tutorial, you should have some familiarity with programming in Python 3.
You can use the subprocess.run function to run an external program from your Python code. First, though, you need to import the subprocess and sys modules into your program:
import subprocess
import sys
result = subprocess.run([sys.executable, "-c", "print('ocean')"])
If you run this, you will receive output like the following:
ocean
Let’s review this example:
sys.executable is the absolute path to the Python executable that your program was originally invoked with. For example, sys.executable might be a path like /usr/local/bin/python.
subprocess.run is given a list of strings consisting of the components of the command we are trying to run. Since the first string we pass is sys.executable, we are instructing subprocess.run to execute a new Python program.
The -c component is a python command line option that allows you to pass a string with an entire Python program to execute. In our case, we pass a program that prints the string ocean.
You can think of each entry in the list that we pass to subprocess.run as being separated by a space. For example, [sys.executable, "-c", "print('ocean')"] translates roughly to /usr/local/bin/python -c "print('ocean')". Note that subprocess passes each entry to the program as a separate argument (quoting entries where the underlying operating system requires it), so, for example, you can pass a filename that has spaces in it.
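To illustrate the point about spaces, here is a minimal sketch in which the child program reports how many arguments it received. The filename my notes.txt is a made-up example, and the capture_output and text keyword arguments (covered later in this tutorial) are used only so the result can be inspected.

```python
import subprocess
import sys

# Child program: print the number of arguments it received and the first one.
child_code = "import sys; print(len(sys.argv) - 1, sys.argv[1])"

# "my notes.txt" is a hypothetical filename that contains a space.
result = subprocess.run(
    [sys.executable, "-c", child_code, "my notes.txt"],
    capture_output=True,  # requires Python 3.7+
    text=True,
)
print(result.stdout)
```

If the list entry were split on the space, the child would report two arguments instead of one.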
Warning: Never pass untrusted input to subprocess.run. Since subprocess.run has the ability to perform arbitrary commands on your computer, malicious actors can use it to manipulate your computer in unexpected ways.
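One possible way to act on this warning is to let outside input choose only among commands you have written yourself. The sketch below assumes a small allowlist; the names ALLOWED_COMMANDS and run_allowed are hypothetical and not part of the subprocess module.

```python
import subprocess
import sys

# Hypothetical allowlist: the only commands this program agrees to run.
ALLOWED_COMMANDS = {
    "hello": [sys.executable, "-c", "print('hello')"],
    "version": [sys.executable, "--version"],
}


def run_allowed(name):
    """Run a pre-approved command selected by name; reject anything else."""
    if name not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {name!r}")
    return subprocess.run(ALLOWED_COMMANDS[name], capture_output=True, text=True)


print(run_allowed("hello").stdout)
```

With this design, untrusted input is never placed into the command itself; it is only used as a lookup key.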
Now that we can invoke an external program using subprocess.run, let’s see how we can capture output from that program. For example, this process could be useful if we wanted to use git ls-files to output all the files currently stored under version control.
Note: The examples shown in this section require Python 3.7 or higher. In particular, the capture_output and text keyword arguments were added in Python 3.7 when it was released in June 2018.
Let’s add to our previous example:
import subprocess
import sys
result = subprocess.run(
    [sys.executable, "-c", "print('ocean')"], capture_output=True, text=True
)
print("stdout:", result.stdout)
print("stderr:", result.stderr)
If we run this code, we’ll receive output like the following:
stdout: ocean
stderr:
This example is largely the same as the one introduced in the first section: we are still running a subprocess to print ocean. Importantly, however, we pass the capture_output=True and text=True keyword arguments to subprocess.run.
subprocess.run returns a subprocess.CompletedProcess object that is bound to result. The subprocess.CompletedProcess object includes details about the external program’s exit code and its output. capture_output=True ensures that result.stdout and result.stderr are filled in with the corresponding output from the external program. By default, result.stdout and result.stderr are bound as bytes, but the text=True keyword argument instructs Python to instead decode the bytes into strings.
In the output section, stdout is ocean (plus the trailing newline that print adds implicitly), and we have no stderr.
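The difference between bytes and decoded text can be seen in a short sketch. Here a Python child process stands in for a command such as git ls-files that prints one item per line; the two filenames are invented for illustration.

```python
import subprocess
import sys

# Stand-in for a command that prints one filename per line.
child_code = "print('README.md'); print('setup.py')"

# Without text=True, the captured output is bytes.
raw = subprocess.run([sys.executable, "-c", child_code], capture_output=True)

# With text=True, the captured output is decoded into a str.
decoded = subprocess.run(
    [sys.executable, "-c", child_code], capture_output=True, text=True
)

print(type(raw.stdout))
print(type(decoded.stdout))

# Splitting on line boundaries turns the output into a list of entries.
files = decoded.stdout.splitlines()
print(files)
```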
Let’s try an example that produces a non-empty value for stderr:
import subprocess
import sys
result = subprocess.run(
    [sys.executable, "-c", "raise ValueError('oops')"], capture_output=True, text=True
)
print("stdout:", result.stdout)
print("stderr:", result.stderr)
If we run this code, we receive output like the following:
stdout:
stderr: Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: oops
This code runs a Python subprocess that immediately raises a ValueError. When we inspect the final result, we see nothing in stdout and a Traceback of our ValueError in stderr. This is because by default Python writes the Traceback of the unhandled exception to stderr.
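Besides stdout and stderr, the CompletedProcess object also carries the exit code in its returncode attribute. A minimal sketch of using it to decide whether the subprocess failed might look like this:

```python
import subprocess
import sys

result = subprocess.run(
    [sys.executable, "-c", "raise ValueError('oops')"],
    capture_output=True,
    text=True,
)

# A non-zero return code signals that the subprocess failed.
if result.returncode != 0:
    # The last line of a Python traceback names the exception.
    last_line = result.stderr.strip().splitlines()[-1]
    print("failed with exit code", result.returncode)
    print("last line of stderr:", last_line)
```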
Sometimes it’s useful to raise an exception if a program we run exits with a bad exit code. Programs that exit with a zero code are considered successful, but programs that exit with a non-zero code are considered to have encountered an error. As an example, this pattern could be useful if we wanted to raise an exception in the event that we run git ls-files in a directory that wasn’t actually a git repository.
We can use the check=True keyword argument to subprocess.run to have an exception raised if the external program returns a non-zero exit code:
import subprocess
import sys
result = subprocess.run([sys.executable, "-c", "raise ValueError('oops')"], check=True)
If we run this code, we receive output like the following:
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: oops
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/subprocess.py", line 512, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/usr/local/bin/python', '-c', "raise ValueError('oops')"]' returned non-zero exit status 1.
This output shows that we ran a subprocess that raised an error, which is printed to stderr in our terminal. Then subprocess.run dutifully raised a subprocess.CalledProcessError on our behalf in our main Python program.
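In practice you may prefer to catch the exception instead of letting it end your program. The following sketch assumes a child process that writes a message to stderr and exits with code 3; both the message and the exit code are arbitrary choices for illustration.

```python
import subprocess
import sys

# Child program: write a message to stderr, then exit with code 3.
child_code = "import sys; sys.stderr.write('bad input'); sys.exit(3)"

failure = None
try:
    subprocess.run(
        [sys.executable, "-c", child_code],
        check=True,
        capture_output=True,
        text=True,
    )
except subprocess.CalledProcessError as exc:
    # The exception carries the exit code and any captured output.
    failure = exc
    print("exit code:", exc.returncode)
    print("stderr:", exc.stderr)
```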
Alternatively, the subprocess module also includes the subprocess.CompletedProcess.check_returncode method, which we can invoke for similar effect:
import subprocess
import sys
result = subprocess.run([sys.executable, "-c", "raise ValueError('oops')"])
result.check_returncode()
If we run this code, we’ll receive:
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: oops
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/subprocess.py", line 444, in check_returncode
raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['/usr/local/bin/python', '-c', "raise ValueError('oops')"]' returned non-zero exit status 1.
Since we didn’t pass check=True to subprocess.run, we successfully bound a subprocess.CompletedProcess instance to result even though our program exited with a non-zero code. Calling result.check_returncode(), however, raises a subprocess.CalledProcessError because it detects the completed process exited with a bad code.
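Because check_returncode is called after the process has finished, you can decide when to raise. One possible use, sketched below, is tolerating certain non-zero exit codes. The helper name run_tolerating and its ok_codes parameter are hypothetical.

```python
import subprocess
import sys


def run_tolerating(args, ok_codes=(0, 1)):
    """Run args; raise CalledProcessError only for exit codes outside ok_codes."""
    result = subprocess.run(args, capture_output=True, text=True)
    if result.returncode not in ok_codes:
        # check_returncode() raises CalledProcessError for any non-zero code.
        result.check_returncode()
    return result


# Exit code 1 is tolerated here, so a CompletedProcess is returned.
tolerated = run_tolerating([sys.executable, "-c", "import sys; sys.exit(1)"])
print("tolerated exit code:", tolerated.returncode)
```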
subprocess.run includes the timeout argument to allow you to stop an external program if it is taking too long to execute:
import subprocess
import sys
result = subprocess.run([sys.executable, "-c", "import time; time.sleep(2)"], timeout=1)
If we run this code, we’ll receive output like the following:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/subprocess.py", line 491, in run
stdout, stderr = process.communicate(input, timeout=timeout)
File "/usr/local/lib/python3.8/subprocess.py", line 1024, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/usr/local/lib/python3.8/subprocess.py", line 1892, in _communicate
self.wait(timeout=self._remaining_time(endtime))
File "/usr/local/lib/python3.8/subprocess.py", line 1079, in wait
return self._wait(timeout=timeout)
File "/usr/local/lib/python3.8/subprocess.py", line 1796, in _wait
raise TimeoutExpired(self.args, timeout)
subprocess.TimeoutExpired: Command '['/usr/local/bin/python', '-c', 'import time; time.sleep(2)']' timed out after 0.9997982999999522 seconds
The subprocess we tried to run used the time.sleep function to sleep for 2 seconds. However, we passed the timeout=1 keyword argument to subprocess.run to time out our subprocess after 1 second. This explains why our call to subprocess.run ultimately raised a subprocess.TimeoutExpired exception.
Note that the timeout keyword argument to subprocess.run is approximate. Python will make a best effort to kill the subprocess after the timeout number of seconds, but it won’t necessarily be exact.
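If a timeout is an expected outcome, you can catch subprocess.TimeoutExpired and carry on. This is a minimal sketch; the sleep duration and timeout value are arbitrary.

```python
import subprocess
import sys

timed_out = None
try:
    # The child sleeps far longer than the timeout allows.
    subprocess.run(
        [sys.executable, "-c", "import time; time.sleep(5)"], timeout=0.5
    )
except subprocess.TimeoutExpired as exc:
    # subprocess.run kills the child before re-raising the exception.
    timed_out = exc
    print(f"gave up after about {exc.timeout:.1f} seconds")
```

The reported value is approximate, in line with the note above.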
Sometimes programs expect input to be passed to them via stdin.
The input keyword argument to subprocess.run allows you to pass data to the stdin of the subprocess. For example:
import subprocess
import sys
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read())"], input=b"underwater"
)
We’ll receive output like the following after running this code:
underwater
In this case, we passed the bytes underwater to input. Our target subprocess used sys.stdin to read the data passed in through stdin (underwater) and printed it.
The input keyword argument can be useful if you want to chain multiple subprocess.run calls together, passing the output of one program as the input to another.
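A sketch of such a chain follows. Two Python child processes stand in for real external programs: the first produces some text, and the second reads its stdin and prints it in upper case. With text=True, input is passed as a str instead of bytes.

```python
import subprocess
import sys

# First program: produce some output.
first = subprocess.run(
    [sys.executable, "-c", "print('underwater')"],
    capture_output=True,
    text=True,
    check=True,
)

# Second program: read stdin and print it in upper case.
second = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.stdin.read().upper())"],
    input=first.stdout,  # the output of the first program becomes the input
    capture_output=True,
    text=True,
    check=True,
)

print(second.stdout.strip())
```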
The subprocess module is a powerful part of the Python standard library that lets you run external programs and inspect their outputs easily. In this tutorial, you have learned to use subprocess.run to control external programs, pass input to them, capture their output, and check their return codes.
The subprocess module exposes additional classes and utilities that we did not cover in this tutorial. Now that you have a baseline, you can use the subprocess module’s documentation to learn more about other available classes and utilities.
Editor: Kathryn Hancox