https://gist.github.com/bgoonz/dd7fd80df384fba134b42f256298bdb4#file-python-cheatsheet-py
So let’s dive into the best Python cheat sheets recommended by us.
This is the best single cheat sheet. It uses every inch of the page to deliver value and covers everything you need to know to go from beginner to intermediate. Topics covered include container types, conversions, modules, maths, conditionals and formatting to name a few. A highly recommended 2-page sheet!
Some might think this cheat sheet is a bit lengthy. At 26 pages, it is one of the most comprehensive cheat sheets out there. It explains variables, data structures, exceptions, and classes, to name just a few. If you want the most thorough cheat sheet, pick this one.
Some of the most popular things to do with Python are Data Science and Machine Learning. In this cheat sheet, you will learn the basics of Python and the most important scientific library: NumPy (Numerical Python). You’ll learn the basics plus the most important NumPy functions. If you are using Python for Data Science, download this cheat sheet.
This Python data science cheat sheet from DataCamp is all about getting data into your code. Think about it: importing data is one of the most important tasks when working with data. Increasing your skills in this area will make you a better data scientist—and a better coder overall!
This cheat sheet is for more advanced learners. It covers class, string and list methods as well as system calls from the sys module. Once you’re comfortable defining basic classes and command line interfaces (CLIs), get this cheat sheet. It will take you to another level.
Want to learn Python well, but don’t have much time? Then this course is for you. It contains 5 carefully designed PDF cheat sheets. Each cheat sheet takes you one step further into the rabbit hole. You will learn practical Python concepts from the hand-picked examples and code snippets. The topics include basic keywords, simple and complex data types, crucial string and list methods, and powerful Python one-liners. If you lead a busy life and do not want to compromise on quality, this is the cheat sheet course for you!
The wonderful team at Dataquest have put together this comprehensive beginners cheat sheet. It covers all the basic data types, looping and reading files. It’s beautifully designed and is the first of a series.
This intermediate-level cheat sheet is a follow-up of the other Dataquest cheat sheet. It contains intermediate dtype methods, looping and handling errors.
NumPy is at the heart of data science. Advanced libraries like scikit-learn, TensorFlow, Pandas, and Matplotlib are all built on NumPy arrays. You need to understand NumPy before you can thrive in data science and machine learning. The topics of this cheat sheet are creating arrays, combining arrays, scalar math, vector math and statistics.
This is only one great NumPy cheat sheet—if you want to get more, check out our article on the 10 best NumPy cheat sheets!
Want to master the visualization library Bokeh? This cheat sheet is for you! It contains all the basic Bokeh commands to get your beautiful visualizations going fast!
Pandas is everywhere. If you want to master “the Excel library for Python coders”, why not start with this cheat sheet? It’ll get you started fast and introduces the most important Pandas functions to you.
You can find a best-of article about the 7 best Pandas Cheat Sheets here.
Regex to the rescue! Regular expressions are wildly important for anyone who handles large amounts of text programmatically (ask Google). This cheat sheet introduces the most important Regex commands for quick reference. Download & master these regular expressions!
This cheat sheet is the most concise Python cheat sheet in the world. It contains keywords, basic data structures, and complex data structures—all in a single 1-page PDF file. If you’re lazy, this cheat sheet is a must!
If you love cheat sheets, here are some interesting references for you (with many more PDF downloads):
From Highest to Lowest precedence:
Examples of expressions in the interactive shell:
String concatenation:
Note: Avoid the + operator for string concatenation. Prefer string formatting.
String Replication:
You can name a variable anything as long as it obeys the following rules:
It can be only one word.
It can use only letters, numbers, and the underscore (_) character.
It can’t begin with a number.
Variable names starting with an underscore (_) are, by convention, considered "unused" (throwaway values).
Example: _spam should not be used again in the code.
Inline comment:
Multiline comment:
Code with a comment:
Please note the two spaces in front of the comment.
Function docstring:
Example Code:
Evaluates to the integer value of the number of characters in a string:
Note: to test whether a string, list, dictionary, etc. is empty, do not use len(); prefer direct boolean evaluation.
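A minimal sketch of the emptiness test by direct boolean evaluation. The helper name describe is made up for illustration:

```python
def describe(container):
    """Report whether a container is empty using its truth value."""
    # Empty strings, lists, dicts, tuples and sets are all falsy,
    # so there is no need to compare len(container) with 0.
    if not container:
        return "empty"
    return "not empty"


results = [describe(""), describe([]), describe({}), describe("spam"), describe([0])]
print(results)
```

Note that [0] counts as not empty: the list holds one element, even though that element is itself falsy.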
Integer to String or Float:
Float to Integer:
These operators evaluate to True or False depending on the values you give them.
Examples:
Never use the == or != operators to evaluate a Boolean operation. Use the is or is not operators, or use implicit Boolean evaluation.
NO (even if they are valid Python):
YES (even if they are valid Python):
These statements are equivalent:
And these as well:
There are three Boolean operators: and, or, and not.
The and Operator’s Truth Table:
The or Operator’s Truth Table:
The not Operator’s Truth Table:
You can also use multiple Boolean operators in an expression, along with the comparison operators:
If the execution reaches a break statement, it immediately exits the while loop’s clause:
When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop.
The range() function can also be called with three arguments. The first two arguments will be the start and stop values, and the third will be the step argument. The step is the amount that the variable is increased by after each iteration.
You can even use a negative number for the step argument to make the for loop count down instead of up.
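A short sketch of range() with a step argument, counting up and then down. The variable names are illustrative:

```python
# start=0, stop=10 (exclusive), step=2
evens = list(range(0, 10, 2))

# A negative step counts down; the stop value -1 is still excluded.
countdown = list(range(5, -1, -1))

print(evens)      # [0, 2, 4, 6, 8]
print(countdown)  # [5, 4, 3, 2, 1, 0]
```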
The else clause of a loop lets you specify a statement to execute once the full loop has completed. It is only useful when a break condition can occur in the loop:
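A possible illustration of a loop with an else clause. The function name find_first_negative is hypothetical:

```python
def find_first_negative(numbers):
    """Return the first negative number, or None if there is none."""
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting break.
        result = None
    return result


print(find_first_negative([3, -1, -5]))  # -1
print(find_first_negative([1, 2]))       # None
```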
When creating a function using the def statement, you can specify what the return value should be with a return statement. A return statement consists of the following:
The return keyword.
The value or expression that the function should return.
Note: never compare to None with the == operator. Always use is.
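A small sketch of why the identity test matters: a legitimate value such as 0 is falsy, so only is None separates "no value" from "a value that happens to be falsy". The users dictionary and find_user helper are invented for the example:

```python
def find_user(users, name):
    """Return the stored age for name, or None if the name is unknown."""
    return users.get(name)


users = {"alice": 30, "bob": 0}

bob_age = find_user(users, "bob")
carol_age = find_user(users, "carol")

# Identity comparison against None, never == None.
bob_known = bob_age is not None      # True, even though 0 is falsy
carol_known = carol_age is not None  # False

print(bob_known, carol_known)
```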
Code in the global scope cannot use any local variables.
However, a local scope can access global variables.
Code in a function’s local scope cannot use variables in any other local scope.
You can use the same name for different variables if they are in different scopes. That is, there can be a local variable named spam and a global variable also named spam.
If you need to modify a global variable from within a function, use the global statement:
There are four rules to tell whether a variable is in a local scope or global scope:
If a variable is being used in the global scope (that is, outside of all functions), then it is always a global variable.
If there is a global statement for that variable in a function, it is a global variable.
Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.
But if the variable is not used in an assignment statement, it is a global variable.
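The scope rules above can be sketched as follows. The function names are illustrative only:

```python
spam = "global"


def read_global():
    # No assignment to spam here, so the name refers to the global variable.
    return spam


def shadow_global():
    # The assignment makes spam local to this function; the global is untouched.
    spam = "local"
    return spam


def rebind_global():
    # The global statement makes the assignment target the global variable.
    global spam
    spam = "rebound"


before = read_global()
local_value = shadow_global()
after_shadow = read_global()
rebind_global()
after_rebind = read_global()

print(before, local_value, after_shadow, after_rebind)
```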
Code inside the finally section is always executed, whether or not an exception has been raised, and even if an exception is not caught.
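A minimal sketch showing that the finally section runs on both the success path and the error path. The events list exists only to record the order of execution:

```python
events = []


def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        events.append("caught division by zero")
        return None
    finally:
        # Runs before the function actually returns, in both cases.
        events.append("finally ran")


ok = divide(6, 3)
failed = divide(1, 0)

print(ok, failed)
print(events)
```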
Slicing the complete list will perform a copy:
The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:
You could type this line of code:
The multiple assignment trick can also be used to swap the values in two variables:
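Both uses of the multiple assignment trick, sketched with made-up values:

```python
cat = ["fat", "orange", "loud"]

# One line instead of three separate index assignments.
size, color, disposition = cat

# Swapping two variables without a temporary name.
a, b = "Alice", "Bob"
a, b = b, a

print(size, color, disposition)
print(a, b)
```

The number of names on the left must match the number of items in the list, otherwise Python raises a ValueError.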
Examples:
append():
insert():
If the value appears multiple times in the list, only the first instance of the value will be removed.
You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order:
If you need to sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call:
You can use the built-in function sorted() to return a new list:
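A short sketch of the reverse and key arguments, using sorted() so the original list stays unchanged. The sample values are illustrative:

```python
letters = ["a", "z", "A", "Z"]

# Default ordering compares code points, so uppercase sorts before lowercase.
by_code_point = sorted(letters)

# key=str.lower compares the lowercased form, giving alphabetical order.
alphabetical = sorted(letters, key=str.lower)

# reverse=True sorts from largest to smallest.
descending = sorted([2, 5, 3.14, 1, -7], reverse=True)

print(by_code_point)  # ['A', 'Z', 'a', 'z']
print(alphabetical)   # ['a', 'A', 'z', 'Z']
print(descending)     # [5, 3.14, 2, 1, -7]
```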
The main way that tuples are different from lists is that tuples, like strings, are immutable.
Example Dictionary:
values():
keys():
items():
Using the keys(), values(), and items() methods, a for loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively.
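A minimal sketch of the three iteration styles, with an illustrative dictionary:

```python
spam = {"color": "red", "age": 42}

keys = [k for k in spam.keys()]
values = [v for v in spam.values()]

# items() yields (key, value) tuples, which can be unpacked in the loop header.
pairs = [f"{k}={v}" for k, v in spam.items()]

print(keys)    # ['color', 'age']
print(values)  # ['red', 42]
print(pairs)   # ['color=red', 'age=42']
```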
get() takes two parameters: the key, and a default value to return if the key does not exist.
Let's consider this code:
Using setdefault(), we could write the same code more succinctly:
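A sketch of the two equivalent approaches side by side. The dictionaries and values are made up for illustration:

```python
# Long form: check for the key, then assign.
spam = {"name": "Pooka", "age": 5}
if "color" not in spam:
    spam["color"] = "black"

# Short form: setdefault() assigns only when the key is missing.
eggs = {"name": "Pooka", "age": 5}
first = eggs.setdefault("color", "black")
second = eggs.setdefault("color", "white")  # key exists, so nothing changes

print(spam == eggs)   # True
print(first, second)  # black black
```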
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
There are two ways to create sets: using curly braces {} and the built-in function set().
When creating an empty set, be sure not to use the curly braces {}, or you will get an empty dictionary instead.
A set automatically removes all duplicate values.
And as an unordered data type, sets can't be indexed.
Using the add() method, we can add a single element to the set.
And with update(), multiple ones.
Both methods will remove an element from the set, but remove() will raise a KeyError if the value doesn't exist.
discard() won't raise any errors.
union() or | will create a new set that contains all the elements from the sets provided.
intersection() or & will return a set containing only the elements that are common to all of them.
difference() or - will return only the elements that are unique to the first set (the invoked set).
symmetric_difference() or ^ will return all the elements that are not common between them.
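The four set operations, sketched with two small illustrative sets:

```python
s1 = {1, 2, 3}
s2 = {3, 4, 5}

union = s1 | s2            # same as s1.union(s2)
common = s1 & s2           # same as s1.intersection(s2)
only_first = s1 - s2       # same as s1.difference(s2)
either_not_both = s1 ^ s2  # same as s1.symmetric_difference(s2)

print(union)            # {1, 2, 3, 4, 5}
print(common)           # {3}
print(only_first)       # {1, 2}
print(either_not_both)  # {1, 2, 4, 5}
```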
The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an “iterator algebra” making it possible to construct specialized tools succinctly and efficiently in pure Python.
The itertools module comes in the standard library and must be imported.
Makes an iterator that returns the results of a function.
Example:
The operator.mul takes two numbers and multiplies them:
Passing a function is optional:
If no function is designated the items will be summed:
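The behavior described here matches itertools.accumulate(). A minimal sketch, with illustrative data:

```python
import itertools
import operator

data = [1, 2, 3, 4, 5]

# With operator.mul, each result is the running product.
running_product = list(itertools.accumulate(data, operator.mul))

# Without a function, the items are summed.
running_sum = list(itertools.accumulate(data))

print(running_product)  # [1, 2, 6, 24, 120]
print(running_sum)      # [1, 3, 6, 10, 15]
```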
Takes an iterable and an integer r. This will create all the unique combinations that have r members.
Example:
Just like combinations(), but allows individual elements to be repeated more than once.
Example:
Makes an iterator that returns evenly spaced values starting with number start.
Example:
This function cycles through an iterator endlessly.
Example:
When it reaches the end of the iterable, it starts over again from the beginning.
Take a series of iterables and return them as one long iterable.
Example:
Filters one iterable with another.
Example:
Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element.
Example:
Makes an iterator that filters elements from iterable returning only those for which the predicate is False.
Example:
Simply put, this function groups things together.
Example:
This function is very much like slices. This allows you to cut out a piece of an iterable.
Example:
Example:
Creates the cartesian products from a series of iterables.
This function will repeat an object over and over again, unless a times argument is given.
Example:
Makes an iterator that computes the function using arguments obtained from the iterable.
Example:
The opposite of dropwhile(). Makes an iterator and returns elements from the iterable as long as the predicate is true.
Example:
Return n independent iterators from a single iterable.
Example:
Makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted.
Example:
A List comprehension can be generated from a dictionary:
Example:
A raw string completely ignores all escape characters and prints any backslash that appears in the string.
Note: mostly used for regular expression definitions (see the re package).
To keep a nicer flow in your code, you can use the dedent function from the textwrap standard package.
This generates the same string as before.
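A sketch of how dedent might be used so that a multiline string can follow the indentation of the surrounding code. The function name and the text are illustrative:

```python
from textwrap import dedent


def multiline_string():
    # The backslash after the opening quotes avoids a leading blank line;
    # dedent() strips the indentation common to every line.
    return dedent("""\
        Dear Alice,

        Eve's cat has been arrested.
        """)


text = multiline_string()
print(text)
```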
Slicing:
upper() and lower():
isupper() and islower():
isalpha() returns True if the string consists only of letters and is not blank.
isalnum() returns True if the string consists only of letters and numbers and is not blank.
isdecimal() returns True if the string consists only of numeric characters and is not blank.
isspace() returns True if the string consists only of spaces, tabs, and newlines and is not blank.
istitle() returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.
join():
split():
rjust() and ljust():
An optional second argument to rjust() and ljust() will specify a fill character other than a space character. Enter the following into the interactive shell:
center():
We can use the %x format specifier to format an int value as a hexadecimal string:
Python 3 introduced a new way to do string formatting that was later back-ported to Python 2.7. This makes the syntax for string formatting more regular.
The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.
You would only use %s string formatting with functions that can evaluate their parameters lazily, the most common being logging:
Prefer:
Over:
Or:
It is even possible to do inline arithmetic with it:
Template strings are a simpler and less powerful mechanism, but they are recommended when handling format strings generated by users. Due to their reduced complexity, template strings are a safer choice.
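A minimal sketch using string.Template from the standard library. The placeholder names and values are invented for the example:

```python
from string import Template

t = Template("Hey $name!")
greeting = t.substitute(name="Margaret")

# safe_substitute() leaves unknown placeholders in place instead of raising KeyError.
partial = Template("$who owes $amount").safe_substitute(who="Tim")

print(greeting)  # Hey Margaret!
print(partial)   # Tim owes $amount
```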
Import the regex module with import re.
Create a Regex object with the re.compile() function. (Remember to use a raw string.)
Pass the string you want to search into the Regex object’s search() method. This returns a Match object.
Call the Match object’s group() method to return a string of the actual matched text.
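The steps above can be sketched as follows. The phone-number pattern and the sample sentence are illustrative:

```python
import re

# Steps 1 and 2: compile a raw-string pattern into a Regex object.
phone_regex = re.compile(r"(\d\d\d)-(\d\d\d-\d\d\d\d)")

# Step 3: search() returns a Match object, or None when nothing matches.
match = phone_regex.search("My number is 415-555-4242.")

# Step 4: group() returns the matched text; group(1) returns the first group.
full = match.group()
area_code = match.group(1)
parts = match.groups()

print(full)       # 415-555-4242
print(area_code)  # 415
print(parts)      # ('415', '555-4242')
```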
All the regex functions in Python are in the re module:
To retrieve all the groups at once: use the groups() method—note the plural form for the name.
The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.
You can also use the pipe to match one of several patterns as part of your regex:
The ? character flags the group that precedes it as an optional part of the pattern.
The * (called the star or asterisk) means “match zero or more”—the group that precedes the star can occur any number of times in the text.
While * means “match zero or more,” the + (or plus) means “match one or more”. The group preceding a plus must appear at least once. It is not optional:
If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in curly brackets. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.
Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the curly brackets. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.
Python’s regular expressions are greedy by default, which means that in ambiguous situations they will match the longest string possible. The non-greedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.
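A short sketch contrasting the greedy and non-greedy forms of the curly brackets on the same input:

```python
import re

text = "HaHaHaHaHa"

# Greedy: takes as many repeats as the range allows (up to 5).
greedy = re.compile(r"(Ha){3,5}").search(text).group()

# Non-greedy: the trailing ? stops at the minimum of 3 repeats.
lazy = re.compile(r"(Ha){3,5}?").search(text).group()

print(greedy)  # HaHaHaHaHa
print(lazy)    # HaHaHa
```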
In addition to the search() method, Regex objects also have a findall() method. While search() will return a Match object of the first matched text in the searched string, the findall() method will return the strings of every match in the searched string.
To summarize what the findall() method returns, remember the following:
When called on a regex with no groups, such as \d\d\d-\d\d\d-\d\d\d\d, the method findall() returns a list of string matches, such as ['415-555-9999', '212-555-0000'].
When called on a regex that has groups, such as (\d\d\d)-(\d\d\d)-(\d\d\d\d), the method findall() returns a list of tuples of strings (one string for each group), such as [('415', '555', '9999'), ('212', '555', '0000')].
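Both cases can be checked with a short sketch. The sample text is illustrative:

```python
import re

text = "Cell: 415-555-9999 Work: 212-555-0000"

# No groups: a list of matched strings.
no_groups = re.findall(r"\d\d\d-\d\d\d-\d\d\d\d", text)

# With groups: a list of tuples, one string per group.
with_groups = re.findall(r"(\d\d\d)-(\d\d\d)-(\d\d\d\d)", text)

print(no_groups)    # ['415-555-9999', '212-555-0000']
print(with_groups)  # [('415', '555', '9999'), ('212', '555', '0000')]
```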
There are times when you want to match a set of characters but the shorthand character classes (\d, \w, \s, and so on) are too broad. You can define your own character class using square brackets. For example, the character class [aeiouAEIOU] will match any vowel, both lowercase and uppercase.
You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9] will match all lowercase letters, uppercase letters, and numbers.
By placing a caret character (^) just after the character class’s opening bracket, you can make a negative character class. A negative character class will match all the characters that are not in the character class. For example, enter the following into the interactive shell:
You can also use the caret symbol (^) at the start of a regex to indicate that a match must occur at the beginning of the searched text.
Likewise, you can put a dollar sign ($) at the end of the regex to indicate the string must end with this regex pattern.
And you can use the ^ and $ together to indicate that the entire string must match the regex—that is, it’s not enough for a match to be made on some subset of the string.
The r'^Hello' regular expression string matches strings that begin with 'Hello':
The r'\d$' regular expression string matches strings that end with a numeric character from 0 to 9:
The . (or dot) character in a regular expression is called a wildcard and will match any character except for a newline:
The dot-star uses greedy mode: It will always try to match as much text as possible. To match any and all text in a nongreedy fashion, use the dot, star, and question mark (.*?). The question mark tells Python to match in a nongreedy way:
The dot-star will match everything except a newline. By passing re.DOTALL as the second argument to re.compile(), you can make the dot character match all characters, including the newline character:
To make your regex case-insensitive, you can pass re.IGNORECASE or re.I as a second argument to re.compile():
The sub() method for Regex objects is passed two arguments:
The first argument is a string to replace any matches.
The second is the string to search with the regular expression.
The sub() method returns a string with the substitutions applied:
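A minimal sketch of sub(), with an illustrative pattern and sentence:

```python
import re

names_regex = re.compile(r"Agent \w+")

# First argument: the replacement; second argument: the string to search.
censored = names_regex.sub("CENSORED", "Agent Alice gave the secret documents to Agent Bob.")

print(censored)  # CENSORED gave the secret documents to CENSORED.
```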
Another example:
To tell the re.compile() function to ignore whitespace and comments inside the regular expression string, “verbose mode” can be enabled by passing the variable re.VERBOSE as the second argument to re.compile().
Now instead of a hard-to-read regular expression like this:
you can spread the regular expression over multiple lines with comments like this:
There are two main modules in Python that deal with path manipulation. One is the os.path module and the other is the pathlib module. The pathlib module was added in Python 3.4, offering an object-oriented way to handle file system paths.
On Windows, paths are written using backslashes (\) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash (/) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms.
Fortunately, Python provides easy ways to handle this. We will showcase how to deal with this with both os.path.join and pathlib.Path.joinpath.
Using os.path.join on Windows:
And using pathlib on *nix:
pathlib also provides a shortcut to joinpath using the / operator:
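A sketch of both pathlib joining styles. It uses PurePosixPath and PureWindowsPath so the separators shown do not depend on the platform running the code; the folder names are illustrative:

```python
from pathlib import PurePosixPath, PureWindowsPath

# The / operator joins path segments.
posix = PurePosixPath("usr") / "bin" / "spam"

# joinpath() does the same and accepts several segments at once.
windows = PureWindowsPath("usr").joinpath("bin", "spam")

print(posix)    # usr/bin/spam
print(windows)  # usr\bin\spam
```

In everyday code you would normally use plain Path, which picks the right flavor for the current operating system.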
Notice that the path separator differs between Windows and Unix-based operating systems. That's why you want to use one of the above methods instead of concatenating strings to join paths.
Joining paths is helpful if you need to create different file paths under the same directory.
Using os.path.join on Windows:
Using pathlib on *nix:
Using os on Windows:
Using pathlib on *nix:
Using os on Windows:
Using pathlib on *nix:
Oh no, we got a nasty error! The reason is that the 'delicious' directory does not exist, so we cannot make the 'walnut' and the 'waffles' directories under it. To fix this, do:
And all is good :)
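One way the fix might look, sketched inside a temporary directory so nothing is left behind. The parents=True and exist_ok=True arguments are the relevant part:

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as base:
    nested = Path(base) / "delicious" / "walnut" / "waffles"

    # Without parents=True, mkdir() fails because 'delicious' does not exist yet.
    try:
        nested.mkdir()
        failed_without_parents = False
    except FileNotFoundError:
        failed_without_parents = True

    # parents=True creates every missing intermediate directory.
    nested.mkdir(parents=True)
    created = nested.is_dir()

    # os.makedirs() is the os equivalent; exist_ok=True tolerates existing folders.
    os.makedirs(nested, exist_ok=True)

print(failed_without_parents, created)
```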
There are two ways to specify a file path.
An absolute path, which always begins with the root folder
A relative path, which is relative to the program’s current working directory
There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”
To see if a path is an absolute path:
Using os.path on *nix:
Using pathlib on *nix:
You can extract an absolute path with both os.path and pathlib.
Using os.path on *nix:
Using pathlib on *nix:
You can get a relative path from a starting path to another path.
Using os.path on *nix:
Using pathlib on *nix:
Checking if a file/directory exists:
Using os.path on *nix:
Using pathlib on *nix:
Checking if a path is a file:
Using os.path on *nix:
Using pathlib on *nix:
Checking if a path is a directory:
Using os.path on *nix:
Using pathlib on *nix:
Getting a file's size in bytes:
Using os.path on Windows:
Using pathlib on *nix:
Listing directory contents using os.listdir on Windows:
Listing directory contents using pathlib on *nix:
To find the total size of all the files in this directory:
WARNING: Directories themselves also have a size! So you might want to check whether a path is a file or a directory using the methods discussed in the section above.
Using os.path.getsize() and os.listdir() together on Windows:
Using pathlib on *nix:
The shutil module provides functions for copying files, as well as entire folders.
While shutil.copy() will copy a single file, shutil.copytree() will copy an entire folder and every folder and file contained in it:
The destination path can also specify a filename. In the following example, the source file is moved and renamed:
If there is no eggs folder, then move() will rename bacon.txt to a file named eggs.
Calling os.unlink(path) or Path.unlink() will delete the file at path.
Calling os.rmdir(path) or Path.rmdir() will delete the folder at path. This folder must be empty of any files or folders.
Calling shutil.rmtree(path) will remove the folder at path, and all files and folders it contains will also be deleted.
You can install this module by running pip install send2trash from a Terminal window.
To read/write to a file in Python, you will want to use the with statement, which will close the file for you after you are done.
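A minimal sketch of writing and then reading a file with the with statement. The file name bacon.txt and its contents are illustrative, and a temporary directory keeps the example self-contained:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bacon.txt")

    # The file is closed automatically when the with block ends.
    with open(path, "w", encoding="utf-8") as f:
        f.write("Hello world!\n")
        f.write("Bacon is not a vegetable.\n")

    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

    closed_after_block = f.closed

print(lines)
print(closed_after_block)  # True
```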
To save variables:
To open and read variables:
Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form.
The extractall() method for ZipFile objects extracts all the files and folders from a ZIP file into the current working directory.
The extract() method for ZipFile objects will extract a single file from the ZIP file. Continue the interactive shell example:
This code will create a new ZIP file named new.zip that has the compressed contents of spam.txt.
Open a JSON file with:
Write a JSON file with:
Compared to JSON, YAML allows for much better human maintainability and gives you the option to add comments. It is a convenient choice for configuration files where humans will have to edit it.
There are two main libraries that provide access to YAML files:
Install them using pip install in your virtual environment.
The first one is easier to use, but the second one, Ruamel, implements the YAML specification much better and allows you, for example, to modify YAML content without altering comments.
Open a YAML file with:
Install it with:
Usage:
Exceptions are raised with a raise statement. In code, a raise statement consists of the following:
The raise keyword
A call to the Exception() function
A string with a helpful error message passed to the Exception() function
Often it’s the code that calls the function, not the function itself, that knows how to handle an exception. So you will commonly see a raise statement inside a function and the try and except statements in the code calling the function.
The traceback is displayed by Python whenever a raised exception goes unhandled. But you can also obtain it as a string by calling traceback.format_exc(). This function is useful if you want the information from an exception’s traceback but also want an except statement to gracefully handle the exception. You will need to import Python’s traceback module before calling this function.
The 116 is the return value from the write() method, since 116 characters were written to the file. The traceback text was written to errorInfo.txt.
An assertion is a sanity check to make sure your code isn’t doing something obviously wrong. These sanity checks are performed by assert statements. If the sanity check fails, then an AssertionError exception is raised. In code, an assert statement consists of the following:
The assert keyword
A condition (that is, an expression that evaluates to True or False)
A comma
A string to display when the condition is False
In plain English, an assert statement says, “I assert that this condition holds true, and if not, there is a bug somewhere in the program.” Unlike exceptions, your code should not handle assert statements with try and except; if an assert fails, your program should crash. By failing fast like this, you shorten the time between the original cause of the bug and when you first notice the bug. This will reduce the amount of code you will have to check before finding the code that’s causing the bug.
Disabling Assertions
Assertions can be disabled by passing the -O option when running Python.
To enable the logging module to display log messages on your screen as your program runs, copy the following to the top of your program (but under the #! python shebang line):
Say you wrote a function to calculate the factorial of a number. In mathematics, factorial 4 is 1 × 2 × 3 × 4, or 24. Factorial 7 is 1 × 2 × 3 × 4 × 5 × 6 × 7, or 5,040. Open a new file editor window and enter the following code. It has a bug in it, but you will also enter several log messages to help yourself figure out what is going wrong. Save the program as factorialLog.py.
Logging levels provide a way to categorize your log messages by importance. There are five logging levels, described in Table 10-1 from least to most important. Messages can be logged at each level using a different logging function.
After you’ve debugged your program, you probably don’t want all these log messages cluttering the screen. The logging.disable() function disables these so that you don’t have to go into your program and remove all the logging calls by hand.
Instead of displaying the log messages to the screen, you can write them to a text file. The logging.basicConfig() function takes a filename keyword argument, like so:
This function:
Is equivalent to the lambda function:
You don't even need to bind it to a name like add first:
Like regular nested functions, lambdas also work as lexical closures:
Note: lambda can only evaluate an expression, like a single line of code.
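A short sketch of a lambda used as a lexical closure and of a lambda called without binding it to a name. The name make_adder is illustrative:

```python
def make_adder(n):
    # The returned lambda remembers the value of n from the enclosing scope.
    return lambda x: x + n


plus_3 = make_adder(3)
plus_5 = make_adder(5)

# A lambda can also be called immediately, without giving it a name.
immediate = (lambda x, y: x + y)(5, 3)

print(plus_3(4))  # 7
print(plus_5(4))  # 9
print(immediate)  # 8
```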
Many programming languages have a ternary operator, which defines a conditional expression. The most common usage is to make a terse, simple conditional assignment statement. In other words, it offers one-line code to evaluate the first expression if the condition is true; otherwise it evaluates the second expression.
Example:
Ternary operators can be chained:
The code above is equivalent to:
The names args and kwargs are arbitrary; the important things are the * and ** operators. They can mean:
In a function declaration, * means “pack all remaining positional arguments into a tuple named <name>”, while ** is the same for keyword arguments (except it uses a dictionary, not a tuple).
In a function call, * means “unpack the tuple or list named <name> to positional arguments at this position”, while ** is the same for keyword arguments.
For example you can make a function that you can use to call any other function, no matter what parameters it has:
Inside forward, args is a tuple (of all positional arguments except the first one, because we specified it - the f), kwargs is a dict. Then we call f and unpack them so they become normal arguments to f.
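The forward function described above might look like this sketch. The greet function is a made-up target used only to show the call:

```python
def forward(f, *args, **kwargs):
    # args is a tuple of the remaining positional arguments,
    # kwargs is a dict of the keyword arguments; both are unpacked again for f.
    return f(*args, **kwargs)


def greet(greeting, name, punctuation="!"):
    return f"{greeting}, {name}{punctuation}"


first = forward(greet, "Hello", "Ada")
second = forward(greet, "Hi", name="Bob", punctuation="?")

print(first)   # Hello, Ada!
print(second)  # Hi, Bob?
```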
You use *args when you have an indefinite number of positional arguments.
Similarly, you use **kwargs when you have an indefinite number of keyword arguments.
Functions can accept a variable number of positional arguments by using *args in the def statement.
You can use the items from a sequence as the positional arguments for a function with the * operator.
Using the * operator with a generator may cause your program to run out of memory and crash.
Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.
Function arguments can be specified by position or by keyword.
Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
Optional keyword arguments should always be passed by keyword instead of by position.
While Python's context managers are widely used, few understand the purpose behind their use. These statements, commonly used with reading and writing files, assist the application in conserving system memory and improve resource management by ensuring specific resources are only in use for certain processes.
A context manager is an object that is notified when a context (a block of code) starts and ends. You commonly use one with the with statement. It takes care of the notifying.
For example, file objects are context managers. When a context ends, the file object is closed automatically:
Anything that ends execution of the block causes the context manager's exit method to be called. This includes exceptions, and can be useful when an error causes you to prematurely exit from an open file or connection. Exiting a script without properly closing files/connections is a bad idea, that may cause data loss or other problems. By using a context manager you can ensure that precautions are always taken to prevent damage or loss in this way.
It is also possible to write a context manager using generator syntax thanks to the contextlib.contextmanager decorator:
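A minimal sketch of a generator-based context manager. The name managed and the log list are illustrative; the try/finally ensures the exit step runs even if the block raises:

```python
from contextlib import contextmanager

log = []


@contextmanager
def managed(name):
    log.append(f"enter {name}")
    try:
        # The yielded value is what the with statement binds after "as".
        yield name.upper()
    finally:
        log.append(f"exit {name}")


with managed("resource") as value:
    log.append(f"using {value}")

print(log)
```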
__main__: Top-level script environment
__main__ is the name of the scope in which top-level code executes. A module’s __name__ is set equal to __main__ when read from standard input, a script, or from an interactive prompt.
A module can discover whether or not it is running in the main scope by checking its own __name__, which allows a common idiom for conditionally executing code in a module when it is run as a script or with python -m but not when it is imported:
For a package, the same effect can be achieved by including a main.py module, the contents of which will be executed when the module is run with -m
For example, if we are developing a script that is also designed to be used as a module, we should do:
Every Python module has its __name__ defined, and if this is '__main__', it implies that the module is being run standalone by the user and we can take the appropriate actions.
If you import this script as a module in another script, the name is set to the name of the script/module.
Python files can act as either reusable modules, or as standalone programs.
if __name__ == "__main__": is used to execute some code only if the file was run directly, and not imported.
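A minimal sketch of the idiom, assuming a hypothetical main() function as the script's entry point:

```python
def main():
    """Hypothetical entry point; real work would go here."""
    return "running as a script"


# True only when the file is executed directly (for example
# "python this_file.py"), not when it is imported by another module.
if __name__ == "__main__":
    print(main())
```

When another module imports this file, main() is defined but not called, so the module stays reusable.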
The setup script is the centre of all activity in building, distributing, and installing modules using the Distutils. The main purpose of the setup script is to describe your module distribution to the Distutils, so that the various commands that operate on your modules do the right thing.
The setup.py file is at the heart of a Python project. It describes all of the metadata about your project. There are quite a few fields you can add to a project to give it a rich set of metadata describing the project. However, there are only three required fields: name, version, and packages. The name field must be unique if you wish to publish your package on the Python Package Index (PyPI). The version field keeps track of different releases of the project. The packages field describes where you’ve put the Python source code within your project.
This allows you to easily install Python packages. Often it's enough to write:
and the module will install itself.
Our initial setup.py will also include information about the license and will re-use the README.txt file for the long_description field. This will look like:
Dataclasses are Python classes that are suited for storing data objects. The dataclasses module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes.
They store data and represent a certain data type. Ex: A number. For people familiar with ORMs, a model instance is a data object. It represents a specific kind of entity. It holds attributes that define or represent the entity.
They can be compared to other objects of the same type. Ex: A number can be greater than, less than, or equal to another number.
Python 3.7 provides a decorator dataclass that is used to convert a class into a dataclass.
python 2.7
with dataclass
It is easy to add default values to the fields of your data class.
Type annotations are mandatory for the fields of a dataclass. However, if you don't want to specify a particular type, use typing.Any.
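A minimal sketch pulling these points together. The Product class and its fields are hypothetical, chosen only for illustration:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Product:
    name: str
    price: float = 0.0  # field with a default value
    extra: Any = None   # typing.Any when the type is left open


first = Product("pen", 1.5)
second = Product("pen", 1.5)

# __init__, __repr__ and __eq__ were generated by the decorator.
same = first == second            # True: compared field by field
default_price = Product("ink").price  # 0.0
```

Fields with default values must come after fields without them, as with function parameters.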
A virtual environment lets you test Python code in an encapsulated environment and avoids filling the base Python installation with libraries that might be used for only one project.
Install virtualenv
Install virtualenvwrapper-win (Windows)
Usage:
Make a Virtual Environment
Anything we install now will be specific to this project and available to the projects we connect to this environment.
Set Project Directory
To bind our virtualenv with our current working directory we simply enter:
Deactivate
To move on to something else in the command line, type ‘deactivate’ to deactivate your environment.
Notice how the parentheses disappear.
Workon
Open up the command prompt and type ‘workon HelloWorld’ to activate the environment and move into your root project folder.
Install Poetry
Create a new project
This will create a my-project directory:
The pyproject.toml file will orchestrate your project and its dependencies:
Packages
To add dependencies to your project, you can specify them in the tool.poetry.dependencies section:
Also, instead of modifying the pyproject.toml file by hand, you can use the add command and it will automatically find a suitable version constraint.
To install the dependencies listed in the pyproject.toml:
To remove dependencies:
Install pipenv
Enter your Project directory and install the Packages for your project
Pipenv will install your package and create a Pipfile for you in your project’s directory. The Pipfile is used to track which dependencies your project needs in case you need to re-install them.
Uninstall Packages
Activate the Virtual Environment associated with your Python project
Exit the Virtual Environment
Where packages, notebooks, projects and environments are shared. Your place for free public conda package hosting.
Usage:
Make a Virtual Environment
To use the Virtual Environment, activate it by:
Anything installed now will be specific to the project HelloWorld
Exit the Virtual Environment
Long time Pythoneer Tim Peters succinctly channels the BDFL's guiding principles for Python's design into 20 aphorisms, only 19 of which have been written down.
From Highest to Lowest precedence:
Examples of expressions in the interactive shell:
String concatenation:
Note: Avoid the + operator for string concatenation. Prefer string formatting.
String Replication:
You can name a variable anything as long as it obeys the following rules:
It can be only one word.
It can use only letters, numbers, and the underscore (_) character.
It can’t begin with a number.
Variable names starting with an underscore (_) are considered "unuseful".
Example: _spam should not be used again in the code.
Inline comment:
Multiline comment:
Code with a comment:
Please note the two spaces in front of the comment.
Function docstring:
Example Code:
Evaluates to the integer value of the number of characters in a string:
Note: to test whether strings, lists, dictionaries, etc. are empty, do not use len; prefer direct boolean evaluation.
Integer to String or Float:
Float to Integer:
These operators evaluate to True or False depending on the values you give them.
Examples:
Never use the == or != operators to compare against a boolean value. Use the is or is not operators, or use implicit boolean evaluation.
NO (even if they are valid Python):
YES (even if they are valid Python):
These statements are equivalent:
And these as well:
There are three Boolean operators: and, or, and not.
The and Operator’s Truth Table:
The or Operator’s Truth Table:
The not Operator’s Truth Table:
You can also use multiple Boolean operators in an expression, along with the comparison operators:
If the execution reaches a break statement, it immediately exits the while loop’s clause:
When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop.
The range() function can also be called with three arguments. The first two arguments will be the start and stop values, and the third will be the step argument. The step is the amount that the variable is increased by after each iteration.
You can even use a negative number for the step argument to make the for loop count down instead of up.
An else clause on a loop specifies statements to execute once the loop has run to completion. It is only useful when a break condition can occur in the loop, because break skips the else clause:
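A minimal sketch of a loop with an else clause, assuming a hypothetical find_first_even() helper:

```python
def find_first_even(numbers):
    """Return the first even number, or None if there is none."""
    for n in numbers:
        if n % 2 == 0:
            result = n
            break  # leaving via break skips the else clause
    else:
        # Runs only when the loop finished without hitting break.
        result = None
    return result


found = find_first_even([1, 3, 4, 5])   # 4
missing = find_first_even([1, 3, 5])    # None
```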
When creating a function using the def statement, you can specify what the return value should be with a return statement. A return statement consists of the following:
The return keyword.
The value or expression that the function should return.
Note: never compare to None with the == operator. Always use is.
Code in the global scope cannot use any local variables.
However, a local scope can access global variables.
Code in a function’s local scope cannot use variables in any other local scope.
You can use the same name for different variables if they are in different scopes. That is, there can be a local variable named spam and a global variable also named spam.
If you need to modify a global variable from within a function, use the global statement:
There are four rules to tell whether a variable is in a local scope or global scope:
If a variable is being used in the global scope (that is, outside of all functions), then it is always a global variable.
If there is a global statement for that variable in a function, it is a global variable.
Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.
But if the variable is not used in an assignment statement, it is a global variable.
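The four rules can be sketched as follows, meant to be run as a top-level script. The function names are hypothetical, and spam follows the naming used in the text:

```python
spam = "global"  # rule 1: used outside all functions, so it is global


def read_only():
    return spam  # rule 4: never assigned here, so this is the global


def shadow():
    spam = "local"  # rule 3: assigned here, so this is a new local
    return spam


def rebind():
    global spam  # rule 2: global statement, so assignment hits the global
    spam = "changed"


before = read_only()    # 'global'
local_value = shadow()  # 'local'
after_shadow = spam     # 'global': shadow() did not touch the global
rebind()
after_rebind = spam     # 'changed'
```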
Code inside the finally section is always executed, whether or not an exception has been raised, and even if an exception is not caught.
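A minimal sketch, assuming a hypothetical divide() helper that logs its cleanup step:

```python
log = []  # records which sections ran


def divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        log.append("error")
        return None
    finally:
        # Runs on both paths, even though each one returns first.
        log.append("cleanup")


ok = divide(4, 2)      # 2.0
failed = divide(1, 0)  # None

print(log)  # ['cleanup', 'error', 'cleanup']
```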
Slicing the complete list will perform a copy:
The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:
You could type this line of code:
The multiple assignment trick can also be used to swap the values in two variables:
Examples:
append():
insert():
If the value appears multiple times in the list, only the first instance of the value will be removed.
You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order:
If you need to sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call:
You can use the built-in function sorted
to return a new list:
The main way that tuples are different from lists is that tuples, like strings, are immutable.
Example Dictionary:
values():
keys():
items():
Using the keys(), values(), and items() methods, a for loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively.
get() takes two parameters: the key, and a default value to return if the key does not exist.
Let's consider this code:
Using setdefault
we could write the same code more succinctly:
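A minimal sketch comparing the two forms. The dictionary contents are illustrative only:

```python
# Long form: assign only when the key is missing.
spam = {"name": "Pooka", "age": 5}
if "color" not in spam:
    spam["color"] = "black"

# Same effect with setdefault().
eggs = {"name": "Pooka", "age": 5}
first = eggs.setdefault("color", "black")   # key missing: inserted
second = eggs.setdefault("color", "white")  # key present: left unchanged

print(first, second)  # black black
```

setdefault() returns the value stored under the key after the call, so the second call returns the existing 'black'.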
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
There are two ways to create sets: using curly braces {} and the built-in function set().
When creating an empty set, be sure not to use the curly braces {}, or you will get an empty dictionary instead.
A set automatically removes all duplicate values.
And because a set is an unordered data type, it can't be indexed.
Using the add() method we can add a single element to the set.
And with update(), multiple ones.
Both remove() and discard() will remove an element from the set, but remove() will raise a KeyError if the value doesn't exist.
discard() won't raise any errors.
union() or | will create a new set that contains all the elements from the sets provided.
intersection() or & will return a set containing only the elements that are common to all of them.
difference() or - will return only the elements that are unique to the first set (invoked set).
symmetric_difference() or ^ will return all the elements that are not common between them.
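The four operations can be sketched on two small sets (the values are arbitrary):

```python
s1 = {1, 2, 3}
s2 = {3, 4, 5}

union = s1 | s2          # same as s1.union(s2)
intersection = s1 & s2   # same as s1.intersection(s2)
difference = s1 - s2     # same as s1.difference(s2)
symmetric = s1 ^ s2      # same as s1.symmetric_difference(s2)

print(union)         # {1, 2, 3, 4, 5}
print(intersection)  # {3}
print(difference)    # {1, 2}
print(symmetric)     # {1, 2, 4, 5}
```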
The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an “iterator algebra” making it possible to construct specialized tools succinctly and efficiently in pure Python.
The itertools module comes in the standard library and must be imported.
Makes an iterator that returns the results of a function.
Example:
The operator.mul takes two numbers and multiplies them:
Passing a function is optional:
If no function is designated the items will be summed:
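A minimal sketch of accumulate() with and without a function (the input list is arbitrary):

```python
import itertools
import operator

data = [1, 2, 3, 4, 5]

# With operator.mul each result is the running product.
running_product = list(itertools.accumulate(data, operator.mul))

# Without a function the items are summed.
running_sum = list(itertools.accumulate(data))

print(running_product)  # [1, 2, 6, 24, 120]
print(running_sum)      # [1, 3, 6, 10, 15]
```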
Takes an iterable and an integer r. This will create all the unique combinations that have r members.
Example:
Just like combinations(), but allows individual elements to be repeated more than once.
Example:
Makes an iterator that returns evenly spaced values starting with number start.
Example:
This function cycles through an iterator endlessly.
Example:
When it reaches the end of the iterable, it starts over again from the beginning.
Take a series of iterables and return them as one long iterable.
Example:
Filters one iterable with another.
Example:
Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element.
Example:
Makes an iterator that filters elements from iterable returning only those for which the predicate is False.
Example:
Simply put, this function groups things together.
Example:
This function is very much like slices. This allows you to cut out a piece of an iterable.
Example:
Example:
Creates the cartesian products from a series of iterables.
This function will repeat an object over and over again, unless there is a times argument.
Example:
Makes an iterator that computes the function using arguments obtained from the iterable.
Example:
The opposite of dropwhile(). Makes an iterator and returns elements from the iterable as long as the predicate is true.
Example:
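A minimal sketch contrasting takewhile() with dropwhile() on the same data. The list and the predicate are arbitrary:

```python
import itertools

data = [1, 2, 10, 11, 3]


def is_small(x):
    return x < 5


# takewhile() stops at the first element that fails the predicate.
taken = list(itertools.takewhile(is_small, data))

# dropwhile() skips the leading elements that pass, then returns the rest.
dropped = list(itertools.dropwhile(is_small, data))

print(taken)    # [1, 2]
print(dropped)  # [10, 11, 3]
```

The trailing 3 shows that both functions only look at the leading run: once the predicate fails, it is no longer consulted.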
Return n independent iterators from a single iterable.
Example:
Makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted.
Example:
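A minimal sketch of zip_longest() with iterables of uneven length. The inputs and the fill value are arbitrary:

```python
import itertools

letters = "AB"
numbers = [1, 2, 3]

# The shorter iterable is padded with fillvalue until the longest ends.
pairs = list(itertools.zip_longest(letters, numbers, fillvalue="-"))

print(pairs)  # [('A', 1), ('B', 2), ('-', 3)]
```

The built-in zip() would stop after two pairs here.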
A List comprehension can be generated from a dictionary:
Example:
A raw string completely ignores all escape characters and prints any backslash that appears in the string.
Note: mostly used for regular expression definitions (see the re package).
To keep a nicer flow in your code, you can use the dedent function from the textwrap standard package.
This generates the same string as before.
Slicing:
upper() and lower():
isupper() and islower():
isalpha() returns True if the string consists only of letters and is not blank.
isalnum() returns True if the string consists only of letters and numbers and is not blank.
isdecimal() returns True if the string consists only of numeric characters and is not blank.
isspace() returns True if the string consists only of spaces, tabs, and newlines and is not blank.
istitle() returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.
join():
split():
rjust() and ljust():
An optional second argument to rjust() and ljust() will specify a fill character other than a space character. Enter the following into the interactive shell:
center():
We can use the %x format specifier to convert an int value to a hexadecimal string (use %d for decimal):
Python 3 introduced a new way to do string formatting that was later back-ported to Python 2.7. This makes the syntax for string formatting more regular.
The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.
You should only use %s string formatting with functions that can evaluate their parameters lazily, the most common being logging:
Prefer:
Over:
Or:
It is even possible to do inline arithmetic with it:
A simpler and less powerful mechanism, but it is recommended when handling format strings generated by users. Due to their reduced complexity template strings are a safer choice.
Import the regex module with import re.
Create a Regex object with the re.compile() function. (Remember to use a raw string.)
Pass the string you want to search into the Regex object’s search() method. This returns a Match object.
Call the Match object’s group() method to return a string of the actual matched text.
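The four steps above can be sketched as follows. The phone-number pattern and the sample sentence are illustrative only:

```python
import re  # step 1: import the module

# Step 2: compile a pattern, written as a raw string.
phone_regex = re.compile(r"(\d\d\d)-(\d\d\d-\d\d\d\d)")

# Step 3: search() returns a Match object, or None when nothing matches.
match = phone_regex.search("My number is 415-555-4242.")

# Step 4: group() returns the matched text; group(1) the first group.
whole = match.group()
area_code = match.group(1)

print(whole)      # 415-555-4242
print(area_code)  # 415
```

Real code should check that match is not None before calling group().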
All the regex functions in Python are in the re module:
To retrieve all the groups at once: use the groups() method—note the plural form for the name.
The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.
You can also use the pipe to match one of several patterns as part of your regex:
The ? character flags the group that precedes it as an optional part of the pattern.
The * (called the star or asterisk) means “match zero or more”—the group that precedes the star can occur any number of times in the text.
While * means “match zero or more,” the + (or plus) means “match one or more”. The group preceding a plus must appear at least once. It is not optional:
If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in curly brackets. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.
Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the curly brackets. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.
Python’s regular expressions are greedy by default, which means that in ambiguous situations they will match the longest string possible. The non-greedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.
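A minimal sketch of the difference, reusing the (Ha){3,5} pattern from the text:

```python
import re

text = "HaHaHaHaHa"

# Greedy: matches as many repeats as allowed (up to 5).
greedy = re.compile(r"(Ha){3,5}").search(text).group()

# Non-greedy: the trailing ? makes it stop at the minimum (3).
non_greedy = re.compile(r"(Ha){3,5}?").search(text).group()

print(greedy)      # HaHaHaHaHa
print(non_greedy)  # HaHaHa
```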
In addition to the search() method, Regex objects also have a findall() method. While search() will return a Match object of the first matched text in the searched string, the findall() method will return the strings of every match in the searched string.
To summarize what the findall() method returns, remember the following:
When called on a regex with no groups, such as \d\d\d-\d\d\d-\d\d\d\d, the method findall() returns a list of string matches, such as ['415-555-9999', '212-555-0000'].
When called on a regex that has groups, such as (\d\d\d)-(\d\d\d)-(\d\d\d\d), the method findall() returns a list of tuples of strings (one string for each group), such as [('415', '555', '9999'), ('212', '555', '0000')].
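Both cases can be sketched with the phone numbers used above (the surrounding sentence is made up):

```python
import re

text = "Cell: 415-555-9999 Work: 212-555-0000"

# No groups: a list of whole-match strings.
no_groups = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d").findall(text)

# With groups: a list of tuples, one string per group.
with_groups = re.compile(r"(\d\d\d)-(\d\d\d)-(\d\d\d\d)").findall(text)

print(no_groups)    # ['415-555-9999', '212-555-0000']
print(with_groups)  # [('415', '555', '9999'), ('212', '555', '0000')]
```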
There are times when you want to match a set of characters but the shorthand character classes (\d, \w, \s, and so on) are too broad. You can define your own character class using square brackets. For example, the character class [aeiouAEIOU] will match any vowel, both lowercase and uppercase.
You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9] will match all lowercase letters, uppercase letters, and numbers.
By placing a caret character (^) just after the character class’s opening bracket, you can make a negative character class. A negative character class will match all the characters that are not in the character class. For example, enter the following into the interactive shell:
You can also use the caret symbol (^) at the start of a regex to indicate that a match must occur at the beginning of the searched text.
Likewise, you can put a dollar sign ($) at the end of the regex to indicate the string must end with this regex pattern.
And you can use the ^ and $ together to indicate that the entire string must match the regex—that is, it’s not enough for a match to be made on some subset of the string.
The r'^Hello' regular expression string matches strings that begin with 'Hello':
The r'\d$' regular expression string matches strings that end with a numeric character from 0 to 9:
The . (or dot) character in a regular expression is called a wildcard and will match any character except for a newline:
The dot-star uses greedy mode: It will always try to match as much text as possible. To match any and all text in a nongreedy fashion, use the dot, star, and question mark (.*?). The question mark tells Python to match in a nongreedy way:
The dot-star will match everything except a newline. By passing re.DOTALL as the second argument to re.compile(), you can make the dot character match all characters, including the newline character:
To make your regex case-insensitive, you can pass re.IGNORECASE or re.I as a second argument to re.compile():
The sub() method for Regex objects is passed two arguments:
The first argument is a string to replace any matches.
The second is the string to search, in which every match of the regular expression is replaced.
The sub() method returns a string with the substitutions applied:
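A minimal sketch of sub(), assuming an illustrative pattern that matches the word Agent followed by a name:

```python
import re

names_regex = re.compile(r"Agent \w+")

# First argument: the replacement. Second argument: the string to search.
censored = names_regex.sub(
    "CENSORED", "Agent Alice gave the secret documents to Agent Bob."
)

print(censored)  # CENSORED gave the secret documents to CENSORED.
```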
Another example:
To tell the re.compile() function to ignore whitespace and comments inside the regular expression string, “verbose mode” can be enabled by passing the variable re.VERBOSE as the second argument to re.compile().
Now instead of a hard-to-read regular expression like this:
you can spread the regular expression over multiple lines with comments like this:
There are two main modules in Python that deal with path manipulation. One is the os.path module and the other is the pathlib module. The pathlib module was added in Python 3.4, offering an object-oriented way to handle file system paths.
On Windows, paths are written using backslashes (\) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash (/) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms.
Fortunately, Python provides easy ways to handle this. We will showcase how to deal with this using both os.path.join and pathlib.Path.joinpath.
Using os.path.join
on Windows:
And using pathlib
on *nix:
pathlib also provides a shortcut to joinpath using the / operator:
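A minimal sketch of both spellings. It uses the PurePosixPath and PureWindowsPath classes from pathlib (rather than Path) only so that the output is the same on every platform; the folder names are illustrative:

```python
from pathlib import PurePosixPath, PureWindowsPath

# The / operator joins path segments.
posix_path = PurePosixPath("usr") / "bin" / "spam"

# joinpath() does the same job with a method call.
windows_path = PureWindowsPath("usr").joinpath("bin", "spam")

print(posix_path)    # usr/bin/spam
print(windows_path)  # usr\bin\spam
```

With plain Path, the separator is chosen automatically for the platform the code runs on.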
Notice that the path separator differs between Windows and Unix-based operating systems. That is why you should use one of the above methods instead of concatenating strings to join paths.
Joining paths is helpful if you need to create different file paths under the same directory.
Using os.path.join
on Windows:
Using pathlib
on *nix:
Using os
on Windows:
Using pathlib
on *nix:
Using os
on Windows:
Using pathlib
on *nix:
Oh no, we got a nasty error! The reason is that the 'delicious' directory does not exist, so we cannot make the 'walnut' and the 'waffles' directories under it. To fix this, do:
And all is good :)
There are two ways to specify a file path.
An absolute path, which always begins with the root folder
A relative path, which is relative to the program’s current working directory
There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”
To see if a path is an absolute path:
Using os.path
on *nix:
Using pathlib
on *nix:
You can extract an absolute path with both os.path
and pathlib
Using os.path
on *nix:
Using pathlib
on *nix:
You can get a relative path from a starting path to another path.
Using os.path
on *nix:
Using pathlib
on *nix:
Checking if a file/directory exists:
Using os.path
on *nix:
Using pathlib
on *nix:
Checking if a path is a file:
Using os.path
on *nix:
Using pathlib
on *nix:
Checking if a path is a directory:
Using os.path
on *nix:
Using pathlib
on *nix:
Getting a file's size in bytes:
Using os.path
on Windows:
Using pathlib
on *nix:
Listing directory contents using os.listdir
on Windows:
Listing directory contents using pathlib
on *nix:
To find the total size of all the files in this directory:
WARNING: Directories themselves also have a size! So you might want to check whether a path is a file or a directory using the methods discussed in the above section!
Using os.path.getsize()
and os.listdir()
together on Windows:
Using pathlib
on *nix:
The shutil module provides functions for copying files, as well as entire folders.
While shutil.copy() will copy a single file, shutil.copytree() will copy an entire folder and every folder and file contained in it:
The destination path can also specify a filename. In the following example, the source file is moved and renamed:
If there is no eggs folder, then move() will rename bacon.txt to a file named eggs.
Calling os.unlink(path) or Path.unlink() will delete the file at path.
Calling os.rmdir(path) or Path.rmdir() will delete the folder at path. This folder must be empty of any files or folders.
Calling shutil.rmtree(path) will remove the folder at path, and all files and folders it contains will also be deleted.
You can install this module by running pip install send2trash from a Terminal window.
To read/write to a file in Python, you will want to use the with
statement, which will close the file for you after you are done.
To save variables:
To open and read variables:
Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form.
The extractall() method for ZipFile objects extracts all the files and folders from a ZIP file into the current working directory.
The extract() method for ZipFile objects will extract a single file from the ZIP file. Continue the interactive shell example:
This code will create a new ZIP file named new.zip that has the compressed contents of spam.txt.
Open a JSON file with:
Write a JSON file with:
Compared to JSON, YAML allows for much better human maintainability and gives you the option to add comments. It is a convenient choice for configuration files where humans will have to edit it.
There are two main libraries that give access to YAML files:
Install them using pip install in your virtual environment.
The first one is easier to use, but the second one, Ruamel, implements the YAML specification much better and allows, for example, modifying YAML content without altering comments.
Open a YAML file with:
Install it with:
Usage:
Exceptions are raised with a raise statement. In code, a raise statement consists of the following:
The raise keyword
A call to the Exception() function
A string with a helpful error message passed to the Exception() function
Often it’s the code that calls the function, not the function itself, that knows how to handle an exception. So you will commonly see a raise statement inside a function and the try and except statements in the code calling the function.
The traceback is displayed by Python whenever a raised exception goes unhandled. But you can also obtain it as a string by calling traceback.format_exc(). This function is useful if you want the information from an exception’s traceback but also want an except statement to gracefully handle the exception. You will need to import Python’s traceback module before calling this function.
The 116 is the return value from the write() method, since 116 characters were written to the file. The traceback text was written to errorInfo.txt.
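A minimal sketch of capturing a traceback as a string. It keeps the text in a variable instead of writing it to a file, and the error message is illustrative:

```python
import traceback

try:
    raise Exception("This is the error message.")
except Exception:
    # format_exc() returns the traceback of the exception being handled.
    error_text = traceback.format_exc()

# The program keeps running; the traceback is available as plain text.
first_line = error_text.splitlines()[0]
print(first_line)  # Traceback (most recent call last):
```

The string could then be logged or written to a file with write(), which returns the number of characters written.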
An assertion is a sanity check to make sure your code isn’t doing something obviously wrong. These sanity checks are performed by assert statements. If the sanity check fails, then an AssertionError exception is raised. In code, an assert statement consists of the following:
The assert keyword
A condition (that is, an expression that evaluates to True or False)
A comma
A string to display when the condition is False
In plain English, an assert statement says, “I assert that this condition holds true, and if not, there is a bug somewhere in the program.” Unlike exceptions, your code should not handle assert statements with try and except; if an assert fails, your program should crash. By failing fast like this, you shorten the time between the original cause of the bug and when you first notice the bug. This will reduce the amount of code you will have to check before finding the code that’s causing the bug.
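A minimal sketch of an assert statement with its four parts, assuming a hypothetical apply_discount() function. The try and except around the failing call are there only to show the resulting AssertionError; as noted above, real code should let it crash:

```python
def apply_discount(price, discount):
    # keyword, condition, comma, message
    assert 0 <= discount <= 1, "discount must be between 0 and 1"
    return price * (1 - discount)


valid = apply_discount(100, 0.25)  # 75.0

message = None
try:
    apply_discount(100, 2)  # violates the sanity check
except AssertionError as error:
    message = str(error)

print(message)  # discount must be between 0 and 1
```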
Disabling Assertions
Assertions can be disabled by passing the -O option when running Python.
To enable the logging module to display log messages on your screen as your program runs, copy the following to the top of your program (but under the #! python shebang line):
Say you wrote a function to calculate the factorial of a number. In mathematics, factorial 4 is 1 × 2 × 3 × 4, or 24. Factorial 7 is 1 × 2 × 3 × 4 × 5 × 6 × 7, or 5,040. Open a new file editor window and enter the following code. It has a bug in it, but you will also enter several log messages to help yourself figure out what is going wrong. Save the program as factorialLog.py.
Logging levels provide a way to categorize your log messages by importance. There are five logging levels, described in Table 10-1 from least to most important. Messages can be logged at each level using a different logging function.
After you’ve debugged your program, you probably don’t want all these log messages cluttering the screen. The logging.disable() function disables these so that you don’t have to go into your program and remove all the logging calls by hand.
Instead of displaying the log messages to the screen, you can write them to a text file. The logging.basicConfig() function takes a filename keyword argument, like so:
This function:
Is equivalent to the lambda function:
It does not even need to be bound to a name like add first:
Like regular nested functions, lambdas also work as lexical closures:
Note: lambda can only evaluate an expression, like a single line of code.
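A minimal sketch of both points: a lambda called without being bound to a name, and a lambda acting as a closure. The make_adder() name is hypothetical:

```python
# Called immediately, without binding the lambda to a name.
total = (lambda x, y: x + y)(5, 3)


def make_adder(n):
    # The returned lambda remembers n from the enclosing scope.
    return lambda x: x + n


add_3 = make_adder(3)
add_10 = make_adder(10)

print(total)      # 8
print(add_3(4))   # 7
print(add_10(4))  # 14
```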
Many programming languages have a ternary operator, which defines a conditional expression. The most common usage is to make a terse, simple conditional assignment statement. In other words, it offers one-line code to evaluate the first expression if the condition is true; otherwise it evaluates the second expression.
Example:
Ternary operators can be chained:
The code above is equivalent to:
The names args and kwargs are arbitrary - the important things are the * and ** operators. They can mean:
In a function declaration, * means “pack all remaining positional arguments into a tuple named <name>”, while ** is the same for keyword arguments (except it uses a dictionary, not a tuple).
In a function call, * means “unpack tuple or list named <name> to positional arguments at this position”, while ** is the same for keyword arguments.
For example you can make a function that you can use to call any other function, no matter what parameters it has:
Inside forward, args is a tuple (of all positional arguments except the first one, because we specified it - the f), kwargs is a dict. Then we call f and unpack them so they become normal arguments to f.
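A minimal sketch of such a forwarding function. The greet() function is hypothetical and exists only so there is something to call:

```python
def forward(f, *args, **kwargs):
    # args is a tuple of the extra positional arguments,
    # kwargs is a dict of the keyword arguments.
    return f(*args, **kwargs)  # unpack both again for the real call


def greet(greeting, name, punctuation="!"):
    return f"{greeting}, {name}{punctuation}"


result = forward(greet, "Hello", "World", punctuation="?")
default = forward(greet, "Hello", "World")

print(result)   # Hello, World?
print(default)  # Hello, World!
```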
You use *args when you have an indefinite number of positional arguments.
Similarly, you use **kwargs when you have an indefinite number of keyword arguments.
Functions can accept a variable number of positional arguments by using *args in the def statement.
You can use the items from a sequence as the positional arguments for a function with the * operator.
Using the * operator with a generator may cause your program to run out of memory and crash.
Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.
Function arguments can be specified by position or by keyword.
Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
Optional keyword arguments should always be passed by keyword instead of by position.
While Python's context managers are widely used, few understand the purpose behind their use. These statements, commonly used with reading and writing files, assist the application in conserving system memory and improve resource management by ensuring specific resources are only in use for certain processes.
A context manager is an object that is notified when a context (a block of code) starts and ends. You commonly use one with the with statement. It takes care of the notifying.
For example, file objects are context managers. When a context ends, the file object is closed automatically:
Anything that ends execution of the block causes the context manager's exit method to be called. This includes exceptions, and can be useful when an error causes you to prematurely exit from an open file or connection. Exiting a script without properly closing files/connections is a bad idea, that may cause data loss or other problems. By using a context manager you can ensure that precautions are always taken to prevent damage or loss in this way.
It is also possible to write a context manager using generator syntax thanks to the contextlib.contextmanager decorator:
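One way this could look. The `tracked` function and the `events` list are hypothetical names used only to make the order of execution visible:

```python
from contextlib import contextmanager

events = []  # records what happened, in order


@contextmanager
def tracked(name):
    events.append(f"enter {name}")
    try:
        yield name  # the body of the with block runs here
    finally:
        # Runs when the block ends, normally or through an exception.
        events.append(f"exit {name}")


with tracked("demo") as value:
    events.append(f"inside {value}")
```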
__main__ Top-level script environment
__main__ is the name of the scope in which top-level code executes. A module’s __name__ is set equal to '__main__' when read from standard input, a script, or from an interactive prompt.
A module can discover whether or not it is running in the main scope by checking its own __name__, which allows a common idiom for conditionally executing code in a module when it is run as a script or with python -m but not when it is imported:
For a package, the same effect can be achieved by including a __main__.py module, the contents of which will be executed when the module is run with -m.
For example, if we are developing a script that is also designed to be used as a module, we should do:
Every Python module has its __name__ defined, and if this is '__main__', it implies that the module is being run standalone by the user and we can take the corresponding appropriate actions.
If you import this script as a module in another script, the name is set to the name of the script/module.
Python files can act as either reusable modules, or as standalone programs.
if __name__ == "__main__": is used to execute some code only if the file was run directly, and not imported.
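A minimal sketch of the idiom. The `main` function is a hypothetical entry point:

```python
def main():
    """Hypothetical entry point; the script's real work would go here."""
    return "running as a script"


if __name__ == "__main__":
    # True when the file is executed directly or with python -m,
    # False when the file is imported as a module.
    print(main())
```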
The setup script is the centre of all activity in building, distributing, and installing modules using the Distutils. The main purpose of the setup script is to describe your module distribution to the Distutils, so that the various commands that operate on your modules do the right thing.
The setup.py file is at the heart of a Python project. It describes all of the metadata about your project. There are quite a few fields you can add to a project to give it a rich set of metadata describing the project. However, there are only three required fields: name, version, and packages. The name field must be unique if you wish to publish your package on the Python Package Index (PyPI). The version field keeps track of different releases of the project. The packages field describes where you’ve put the Python source code within your project.
This allows you to easily install Python packages. Often it's enough to write:
and the module will install itself.
Our initial setup.py will also include information about the license and will re-use the README.txt file for the long_description field. This will look like:
Dataclasses are Python classes, but are suited for storing data objects. This module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes.
They store data and represent a certain data type. Ex: A number. For people familiar with ORMs, a model instance is a data object. It represents a specific kind of entity. It holds attributes that define or represent the entity.
They can be compared to other objects of the same type. Ex: A number can be greater than, less than, or equal to another number.
Python 3.7 provides a decorator dataclass that is used to convert a class into a dataclass.
python 2.7
with dataclass
It is easy to add default values to the fields of your data class.
It is mandatory to annotate each field of a dataclass with a type. However, if you don't want to specify a particular type, use typing.Any.
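A short sketch that combines the decorator, default values, and type hints. The `Product` class and its fields are hypothetical:

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class Product:
    name: str
    count: int = 0      # default value
    price: float = 0.0  # default value
    extra: Any = None   # typing.Any when no specific type is wanted


obj = Product("Python", count=2)
print(obj)  # the generated __repr__ lists every field
```

The generated __eq__ compares instances field by field, so two products with the same values are equal.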
A Virtual Environment is used to test Python code in an encapsulated environment, and to avoid filling the base Python installation with libraries we might use for only one project.
Install virtualenv
Install virtualenvwrapper-win (Windows)
Usage:
Make a Virtual Environment
Anything we install now will be specific to this project, and available to the projects we connect to this environment.
Set Project Directory
To bind our virtualenv with our current working directory we simply enter:
Deactivate
To move on to something else, type ‘deactivate’ in the command line to deactivate your environment.
Notice how the parentheses disappear.
Workon
Open up the command prompt and type ‘workon HelloWorld’ to activate the environment and move into your root project folder.
Install Poetry
Create a new project
This will create a my-project directory:
The pyproject.toml file will orchestrate your project and its dependencies:
Packages
To add dependencies to your project, you can specify them in the tool.poetry.dependencies section:
Also, instead of modifying the pyproject.toml file by hand, you can use the add command and it will automatically find a suitable version constraint.
To install the dependencies listed in the pyproject.toml:
To remove dependencies:
Install pipenv
Enter your Project directory and install the Packages for your project
Pipenv will install your package and create a Pipfile for you in your project’s directory. The Pipfile is used to track which dependencies your project needs in case you need to re-install them.
Uninstall Packages
Activate the Virtual Environment associated with your Python project
Exit the Virtual Environment
Where packages, notebooks, projects and environments are shared. Your place for free public conda package hosting.
Usage:
Make a Virtual Environment
To use the Virtual Environment, activate it by:
Anything installed now will be specific to the project HelloWorld
Exit the Virtual Environment
Operators | Operation | Example |
** | Exponent | 2 ** 3 = 8 |
% | Modulus/Remainder | 22 % 8 = 6 |
// | Integer division | 22 // 8 = 2 |
/ | Division | 22 / 8 = 2.75 |
* | Multiplication | 3 * 3 = 9 |
- | Subtraction | 5 - 2 = 3 |
+ | Addition | 2 + 2 = 4 |
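Data Type | Examples |
Integers | -2, -1, 0, 1, 2, 3, 4, 5 |
Floating-point numbers | -1.25, -1.0, -0.5, 0.0, 0.5, 1.0, 1.25 |
Strings | 'a', 'aa', 'aaa', 'Hello!', '11 cats' |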
Operator | Meaning |
== | Equal to |
!= | Not equal to |
< | Less than |
> | Greater than |
<= | Less than or equal to |
>= | Greater than or equal to |
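Expression | Evaluates to |
True and True | True |
True and False | False |
False and True | False |
False and False | False |
Expression | Evaluates to |
True or True | True |
True or False | True |
False or True | True |
False or False | False |
Expression | Evaluates to |
not True | False |
not False | True |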
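Operator | Equivalent |
spam += 1 | spam = spam + 1 |
spam -= 1 | spam = spam - 1 |
spam *= 1 | spam = spam * 1 |
spam /= 1 | spam = spam / 1 |
spam %= 1 | spam = spam % 1 |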
Escape character | Prints as |
\' | Single quote |
\" | Double quote |
\t | Tab |
\n | Newline (line break) |
\\ | Backslash |
Symbol | Matches |
? | zero or one of the preceding group. |
* | zero or more of the preceding group. |
+ | one or more of the preceding group. |
{n} | exactly n of the preceding group. |
{n,} | n or more of the preceding group. |
{,m} | 0 to m of the preceding group. |
{n,m} | at least n and at most m of the preceding group. |
{n,m}? or *? or +? | performs a nongreedy match of the preceding group. |
^spam | means the string must begin with spam. |
spam$ | means the string must end with spam. |
. | any character, except newline characters. |
\d, \w, and \s | a digit, word, or space character, respectively. |
\D, \W, and \S | anything except a digit, word, or space, respectively. |
[abc] | any character between the brackets (such as a, b, or c). |
[^abc] | any character that isn’t between the brackets. |
Level | Logging Function | Description |
DEBUG | logging.debug() | The lowest level. Used for small details. Usually you care about these messages only when diagnosing problems. |
INFO | logging.info() | Used to record information on general events in your program or confirm that things are working at their point in the program. |
WARNING | logging.warning() | Used to indicate a potential problem that doesn’t prevent the program from working but might do so in the future. |
ERROR | logging.error() | Used to record an error that caused the program to fail to do something. |
CRITICAL | logging.critical() | The highest level. Used to indicate a fatal error that has caused or is about to cause the program to stop running entirely. |
Python Cheatsheet
The Zen of Python
Python Basics
Math Operators
Data Types
String Concatenation and Replication
Variables
Comments
The print() Function
The input() Function
The len() Function
The str(), int(), and float() Functions
Flow Control
Comparison Operators
Boolean evaluation
Boolean Operators
Mixing Boolean and Comparison Operators
if Statements
else Statements
elif Statements
while Loop Statements
break Statements
continue Statements
for Loops and the range() Function
For else statement
Importing Modules
Ending a Program Early with sys.exit()
Functions
Return Values and return Statements
The None Value
Keyword Arguments and print()
Local and Global Scope
The global Statement
Exception Handling
Basic exception handling
Final code in exception handling
Lists
Getting Individual Values in a List with Indexes
Negative Indexes
Getting Sublists with Slices
Getting a List’s Length with len()
Changing Values in a List with Indexes
List Concatenation and List Replication
Removing Values from Lists with del Statements
Using for Loops with Lists
Looping Through Multiple Lists with zip()
The in and not in Operators
The Multiple Assignment Trick
Augmented Assignment Operators
Finding a Value in a List with the index() Method
Adding Values to Lists with the append() and insert() Methods
Removing Values from Lists with remove()
Removing Values from Lists with pop()
Sorting the Values in a List with the sort() Method
Tuple Data Type
Converting Types with the list() and tuple() Functions
Dictionaries and Structuring Data
The keys(), values(), and items() Methods
Checking Whether a Key or Value Exists in a Dictionary
The get() Method
The setdefault() Method
Pretty Printing
Merge two dictionaries
sets
Initializing a set
sets: unordered collections of unique elements
set add() and update()
set remove() and discard()
set union()
set intersection
set difference
set symmetric_difference
itertools Module
accumulate()
combinations()
combinations_with_replacement()
count()
cycle()
chain()
compress()
dropwhile()
filterfalse()
groupby()
islice()
permutations()
product()
repeat()
starmap()
takewhile()
tee()
zip_longest()
Comprehensions
List comprehension
Set comprehension
Dict comprehension
Manipulating Strings
Escape Characters
Raw Strings
Multiline Strings with Triple Quotes
Indexing and Slicing Strings
The in and not in Operators with Strings
The in and not in Operators with list
The upper(), lower(), isupper(), and islower() String Methods
The isX String Methods
The startswith() and endswith() String Methods
The join() and split() String Methods
Justifying Text with rjust(), ljust(), and center()
Removing Whitespace with strip(), rstrip(), and lstrip()
Copying and Pasting Strings with the pyperclip Module (need pip install)
String Formatting
% operator
String Formatting (str.format)
Lazy string formatting
Formatted String Literals or f-strings (Python 3.6+)
Template Strings
Regular Expressions
Matching Regex Objects
Grouping with Parentheses
Matching Multiple Groups with the Pipe
Optional Matching with the Question Mark
Matching Zero or More with the Star
Matching One or More with the Plus
Matching Specific Repetitions with Curly Brackets
Greedy and Nongreedy Matching
The findall() Method
Making Your Own Character Classes
The Caret and Dollar Sign Characters
The Wildcard Character
Matching Everything with Dot-Star
Matching Newlines with the Dot Character
Review of Regex Symbols
Case-Insensitive Matching
Substituting Strings with the sub() Method
Managing Complex Regexes
Handling File and Directory Paths
Backslash on Windows and Forward Slash on OS X and Linux
The Current Working Directory
Creating New Folders
Absolute vs. Relative Paths
Handling Absolute and Relative Paths
Checking Path Validity
Finding File Sizes and Folder Contents
Copying Files and Folders
Moving and Renaming Files and Folders
Permanently Deleting Files and Folders
Safe Deletes with the send2trash Module
Walking a Directory Tree
Reading and Writing Files
The File Reading/Writing Process
Opening and reading files with the open() function
Writing to Files
Saving Variables with the shelve Module
Saving Variables with the pprint.pformat() Function
Reading ZIP Files
Extracting from ZIP Files
Creating and Adding to ZIP Files
JSON, YAML and configuration files
JSON
YAML
Anyconfig
Debugging
Raising Exceptions
Getting the Traceback as a String
Assertions
Logging
Logging Levels
Disabling Logging
Logging to a File
Lambda Functions
Ternary Conditional Operator
args and kwargs
Things to Remember(args)
Things to Remember(kwargs)
Context Manager
with statement
Writing your own contextmanager using generator syntax
__main__
Top-level script environment
Advantages
setup.py
Dataclasses
Features
Default values
Type hints
Virtual Environment
virtualenv
poetry
pipenv
anaconda
From the PEP 20 -- The Zen of Python:
Long time Pythoneer Tim Peters succinctly channels the BDFL's guiding principles for Python's design into 20 aphorisms, only 19 of which have been written down.
From Highest to Lowest precedence:
Examples of expressions in the interactive shell:
String concatenation:
Note: Avoid the + operator for string concatenation. Prefer string formatting.
String Replication:
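A small sketch of concatenation, replication, and the formatting alternative the note recommends (variable names are illustrative):

```python
joined = "Alice" + "Bob"  # concatenation
repeated = "Alice" * 3    # replication

name, age = "Alice", 30
# Formatting is usually clearer than chaining + with str() conversions.
sentence = f"{name} is {age} years old"
```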
You can name a variable anything as long as it obeys the following rules:
It can be only one word.
It can use only letters, numbers, and the underscore (_) character.
It can’t begin with a number.
Variable names starting with an underscore (_) are considered "unuseful".
Example: _spam should not be used again in the code.
Inline comment:
Multiline comment:
Code with a comment:
Please note the two spaces in front of the comment.
Function docstring:
Example Code:
Evaluates to the integer value of the number of characters in a string:
Note: to test whether a string, list, dictionary, etc. is empty, do not use len(); prefer direct boolean evaluation.
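For instance, with hypothetical variables:

```python
text = "hello"
items = []

length = len(text)  # number of characters in the string

# An empty container or string is falsy, so no call to len() is needed.
if not items:
    status = "empty"
else:
    status = "not empty"
```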
Integer to String or Float:
Float to Integer:
These operators evaluate to True or False depending on the values you give them.
Examples:
Never use the == or != operators to evaluate a boolean operation. Use the is or is not operators, or use implicit boolean evaluation.
NO (even if they are valid Python):
YES (even if they are valid Python):
These statements are equivalent:
And these as well:
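A sketch of the forms discussed above, using a hypothetical variable `a`:

```python
a = True

# Discouraged, although valid Python
discouraged = a == True  # noqa: E712

# Preferred: identity test or implicit boolean evaluation
with_is = a is True
implicit = bool(a)  # what `if a:` evaluates

# Negated forms
with_is_not = a is not True
implicit_not = not a  # what `if not a:` evaluates
```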
There are three Boolean operators: and, or, and not.
The and Operator’s Truth Table:
The or Operator’s Truth Table:
The not Operator’s Truth Table:
You can also use multiple Boolean operators in an expression, along with the comparison operators:
If the execution reaches a break statement, it immediately exits the while loop’s clause:
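A minimal illustration:

```python
collected = []
n = 0
while True:
    if n == 3:
        break  # leave the loop immediately
    collected.append(n)
    n += 1
```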
When the program execution reaches a continue statement, the program execution immediately jumps back to the start of the loop.
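A minimal illustration:

```python
odds = []
for n in range(6):
    if n % 2 == 0:
        continue  # skip the rest of this iteration
    odds.append(n)
```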
The range() function can also be called with three arguments. The first two arguments will be the start and stop values, and the third will be the step argument. The step is the amount that the variable is increased by after each iteration.
You can even use a negative number for the step argument to make the for loop count down instead of up.
This allows you to specify a statement to execute once the full loop has been executed. It is only useful when a break condition can occur in the loop:
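A sketch of range() with a step and of the for/else construct. The `find` helper is hypothetical:

```python
evens = list(range(0, 10, 2))       # start, stop, step
countdown = list(range(5, -1, -1))  # negative step counts down


def find(target, items):
    for item in items:
        if item == target:
            result = "found"
            break
    else:
        # Runs only when the loop finished without hitting break.
        result = "not found"
    return result
```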
When creating a function using the def statement, you can specify what the return value should be with a return statement. A return statement consists of the following:
The return keyword.
The value or expression that the function should return.
Note: never compare to None with the == operator. Always use is.
Code in the global scope cannot use any local variables.
However, a local scope can access global variables.
Code in a function’s local scope cannot use variables in any other local scope.
You can use the same name for different variables if they are in different scopes. That is, there can be a local variable named spam and a global variable also named spam.
If you need to modify a global variable from within a function, use the global statement:
There are four rules to tell whether a variable is in a local scope or global scope:
If a variable is being used in the global scope (that is, outside of all functions), then it is always a global variable.
If there is a global statement for that variable in a function, it is a global variable.
Otherwise, if the variable is used in an assignment statement in the function, it is a local variable.
But if the variable is not used in an assignment statement, it is a global variable.
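The rules above can be sketched as follows (all names are hypothetical):

```python
eggs = "global"


def read_only():
    return eggs  # no assignment here, so this is the global variable


def shadow():
    eggs = "local"  # assignment creates a separate local variable
    return eggs


def modify():
    global eggs  # assignments now target the global variable
    eggs = "changed"


before = (read_only(), shadow(), eggs)
modify()
after = eggs
```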
Code inside the finally section is always executed, whether or not an exception has been raised, and even if an exception is not caught.
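A minimal sketch. The `divide` function and the `calls` list are hypothetical:

```python
calls = []


def divide(dividend, divisor):
    try:
        return dividend / divisor
    except ZeroDivisionError:
        calls.append("error")
        return None
    finally:
        calls.append("cleanup")  # runs on success and on failure


ok = divide(6, 3)
failed = divide(6, 0)
```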
Slicing the complete list will perform a copy:
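For example (a shallow copy: the list is new, the items are shared):

```python
spam = ["cat", "bat", "rat"]
copy_of_spam = spam[:]      # new list with the same items
copy_of_spam.append("dog")  # does not affect the original
```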
The multiple assignment trick is a shortcut that lets you assign multiple variables with the values in a list in one line of code. So instead of doing this:
You could type this line of code:
The multiple assignment trick can also be used to swap the values in two variables:
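A short sketch with illustrative values:

```python
cat = ["fat", "orange", "loud"]
size, color, disposition = cat  # one variable per list item

a, b = "Alice", "Bob"
a, b = b, a  # swap without a temporary variable
```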
Examples:
append():
insert():
If the value appears multiple times in the list, only the first instance of the value will be removed.
You can also pass True for the reverse keyword argument to have sort() sort the values in reverse order:
If you need to sort the values in regular alphabetical order, pass str.lower for the key keyword argument in the sort() method call:
You can use the built-in function sorted to return a new list:
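A sketch of the sorting options described above:

```python
spam = ["a", "z", "A", "Z"]

descending = sorted(spam, reverse=True)
alphabetical = sorted(spam, key=str.lower)  # new list, spam is unchanged

spam.sort()  # sorts in place; uppercase letters come before lowercase ones
```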
The main way that tuples are different from lists is that tuples, like strings, are immutable.
Example Dictionary:
values():
keys():
items():
Using the keys(), values(), and items() methods, a for loop can iterate over the keys, values, or key-value pairs in a dictionary, respectively.
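For example, with a hypothetical dictionary:

```python
spam = {"color": "red", "age": 42}

keys = list(spam.keys())
values = list(spam.values())
pairs = [f"{key}={value}" for key, value in spam.items()]
```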
get() takes two arguments: the key, and a default value to return if the key does not exist.
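A small sketch with an illustrative dictionary:

```python
picnic_items = {"apples": 5, "cups": 2}

cups = picnic_items.get("cups", 0)
eggs = picnic_items.get("eggs", 0)   # key is missing, so the default is returned
missing = picnic_items.get("eggs")   # without a default, None is returned
```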
Let's consider this code:
Using setdefault we could write the same code more succinctly:
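A sketch of both forms, with an illustrative dictionary:

```python
spam = {"name": "Pooka", "age": 5}

# Longer form
if "color" not in spam:
    spam["color"] = "black"

# Same effect in one call; an existing key keeps its value
returned = spam.setdefault("color", "white")
```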
From the Python 3 documentation
A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.
There are two ways to create sets: using curly braces {} and the built-in function set().
When creating an empty set, be sure not to use the curly braces {} or you will get an empty dictionary instead.
A set automatically removes all duplicate values.
As an unordered data type, sets can't be indexed.
Using the add() method we can add a single element to the set.
And with update(), multiple ones.
Both methods will remove an element from the set, but remove() will raise a KeyError if the value doesn't exist.
discard() won't raise any errors.
union() or | will create a new set that contains all the elements from the sets provided.
intersection() or & will return a set containing only the elements that are common to all of them.
difference() or - will return only the elements that are unique to the first set (the set the method is invoked on).
symmetric_difference() or ^ will return all the elements that are not common between them.
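The set operations above, sketched with illustrative values:

```python
s1 = {1, 2, 3}
s2 = set([3, 4, 5, 5])  # duplicates are dropped: {3, 4, 5}
empty = set()           # {} would create an empty dict

union = s1 | s2        # same as s1.union(s2)
common = s1 & s2       # same as s1.intersection(s2)
only_in_s1 = s1 - s2   # same as s1.difference(s2)
not_shared = s1 ^ s2   # same as s1.symmetric_difference(s2)

s1.add(10)
s1.update([11, 12])
s1.discard(99)  # no error for a missing element
try:
    s1.remove(99)
    remove_raised = False
except KeyError:
    remove_raised = True
```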
The itertools module is a collection of tools intended to be fast and use memory efficiently when handling iterables (like lists or dictionaries).
From the official Python 3.x documentation:
The module standardizes a core set of fast, memory efficient tools that are useful by themselves or in combination. Together, they form an “iterator algebra” making it possible to construct specialized tools succinctly and efficiently in pure Python.
The itertools module comes in the standard library and must be imported.
The operator module will also be used. This module is not necessary when using itertools, but needed for some of the examples below.
Makes an iterator that returns the accumulated results of a two-argument function.
Example:
The operator.mul takes two numbers and multiplies them:
Passing a function is optional:
If no function is designated the items will be summed:
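A short sketch of both cases:

```python
import itertools
import operator

data = [1, 2, 3, 4, 5]
running_product = list(itertools.accumulate(data, operator.mul))
running_sum = list(itertools.accumulate(data))  # no function: items are summed
```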
Takes an iterable and an integer r. This will create all the unique combinations that have r members.
Example:
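A minimal sketch with r = 2:

```python
import itertools

pairs = list(itertools.combinations("ABC", 2))
```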
Just like combinations(), but allows individual elements to be repeated more than once.
Example:
Makes an iterator that returns evenly spaced values starting with number start.
Example:
This function cycles through an iterator endlessly.
Example:
When it reaches the end of the iterable, it starts over again from the beginning.
Take a series of iterables and return them as one long iterable.
Example:
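A minimal sketch chaining a string, a list, and a tuple:

```python
import itertools

chained = list(itertools.chain("AB", [1, 2], (3,)))
```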
Filters one iterable with another.
Example:
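A minimal sketch: the second iterable acts as a mask of truthy and falsy selectors.

```python
import itertools

selected = list(itertools.compress("ABCDE", [1, 0, 1, 0, 1]))
```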
Make an iterator that drops elements from the iterable as long as the predicate is true; afterwards, returns every element.
Example:
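A minimal sketch: once the predicate fails for 6, every later element is kept, even those below 5.

```python
import itertools

kept = list(itertools.dropwhile(lambda x: x < 5, [1, 4, 6, 4, 1]))
```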
Makes an iterator that filters elements from iterable returning only those for which the predicate is False.
Example:
Simply put, this function groups things together.
Example:
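A sketch grouping hypothetical words by first letter. groupby() only groups consecutive items, so the input is sorted by the same key first:

```python
import itertools

words = ["banana", "apple", "cherry", "avocado", "blueberry"]

grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(sorted(words), key=lambda w: w[0])
}
```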
This function is very much like slices. This allows you to cut out a piece of an iterable.
Example:
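A minimal sketch, including a slice of an endless iterator:

```python
import itertools

middle = list(itertools.islice("ABCDEFG", 2, 5))         # start, stop
first_three = list(itertools.islice(itertools.count(), 3))  # works on infinite iterators
```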
Example:
Creates the cartesian products from a series of iterables.
This function will repeat an object over and over again, unless there is a times argument.
Example:
Makes an iterator that computes the function using arguments obtained from the iterable.
Example:
The opposite of dropwhile(). Makes an iterator and returns elements from the iterable as long as the predicate is true.
Example:
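A minimal sketch: iteration stops at the first element that fails the predicate.

```python
import itertools

taken = list(itertools.takewhile(lambda x: x < 5, [1, 4, 6, 4, 1]))
```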
Return n independent iterators from a single iterable.
Example:
Makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled-in with fillvalue. Iteration continues until the longest iterable is exhausted.
Example:
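A minimal sketch with iterables of uneven length:

```python
import itertools

zipped = list(itertools.zip_longest("AB", [1, 2, 3], fillvalue="-"))
```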
A List comprehension can be generated from a dictionary:
Example:
A raw string completely ignores all escape characters and prints any backslash that appears in the string.
Note: mostly used for regular expression definition (see the re package)
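A small sketch (the Windows-style path is illustrative only):

```python
normal = "C:\\temp\\new"  # backslashes must be doubled in a normal string
raw = r"C:\temp\new"      # a raw string keeps backslashes as written

newline_length = len("\n")       # one character: a newline
raw_newline_length = len(r"\n")  # two characters: a backslash and an n
```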
To keep a nicer flow in your code, you can use the dedent function from the textwrap standard package.
This generates the same string as before.
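A sketch of dedent with a hypothetical message. The indentation inside the triple quotes follows the surrounding code and is stripped from the result:

```python
from textwrap import dedent


def build_message():
    message = dedent("""\
        Dear Alice,

        Eve's cat has been arrested.
        """)
    return message
```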
Slicing:
upper() and lower():
isupper() and islower():
isalpha() returns True if the string consists only of letters and is not blank.
isalnum() returns True if the string consists only of letters and numbers and is not blank.
isdecimal() returns True if the string consists only of numeric characters and is not blank.
isspace() returns True if the string consists only of spaces, tabs, and new-lines and is not blank.
istitle() returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.
join():
split():
rjust() and ljust():
An optional second argument to rjust() and ljust() will specify a fill character other than a space character. Enter the following into the interactive shell:
center():
We can use the %d format specifier to convert an int value to a string (%x would format it as hexadecimal):
Note: For new code, using str.format or f-strings (Python 3.6+) is strongly recommended over the % operator.
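A sketch comparing the three styles, with illustrative values:

```python
name = "Pete"
num = 5

old_style = "Hello %s, I have %d apples" % (name, num)
new_style = "Hello {}, I have {} apples".format(name, num)
f_string = f"Hello {name}, I have {num} apples"

hex_text = "%x" % 255  # %x formats the integer as hexadecimal
```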
Python 3 introduced a new way to do string formatting that was later back-ported to Python 2.7. This makes the syntax for string formatting more regular.
The official Python 3.x documentation recommends str.format over the % operator:
The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals or the str.format() interface helps avoid these errors. These alternatives also provide more powerful, flexible and extensible approaches to formatting text.
You would only use %s string formatting on functions that can do lazy parameter evaluation, the most common being logging:
Prefer:
Over:
Or:
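A sketch of why the lazy form is preferred. The `Expensive` class and the logger name are hypothetical, and the class counts how often it is converted to a string:

```python
import logging


class Expensive:
    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)  # DEBUG messages are disabled

# Lazy: the argument is never formatted because the message is not emitted.
logger.debug("value: %s", Expensive())
calls_after_lazy = Expensive.calls

# Eager: the f-string is built before logging can discard the message.
logger.debug(f"value: {Expensive()}")
calls_after_eager = Expensive.calls
```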
It is even possible to do inline arithmetic with it:
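For example, with illustrative values:

```python
a, b = 5, 10
text = f"Five plus ten is {a + b} and not {2 * (a + b)}."
```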
A simpler and less powerful mechanism, but it is recommended when handling format strings generated by users. Due to their reduced complexity template strings are a safer choice.
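A minimal sketch using string.Template with illustrative placeholders:

```python
from string import Template

greeting = Template("Hey $name!").substitute(name="Bob")

# safe_substitute leaves unknown placeholders in place instead of raising KeyError.
partial = Template("$who likes $what").safe_substitute(who="Tim")
```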
Import the regex module with import re.
Create a Regex object with the re.compile() function. (Remember to use a raw string.)
Pass the string you want to search into the Regex object’s search() method. This returns a Match object.
Call the Match object’s group() method to return a string of the actual matched text.
All the regex functions in Python are in the re module:
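The steps above can be sketched as follows, with a hypothetical phone number pattern:

```python
import re

phone_regex = re.compile(r"\d\d\d-\d\d\d-\d\d\d\d")       # raw string pattern
match = phone_regex.search("My number is 415-555-4242.")  # Match object, or None
found = match.group()                                     # the matched text
```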
To retrieve all the groups at once: use the groups() method—note the plural form for the name.
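A sketch with two hypothetical groups:

```python
import re

phone_regex = re.compile(r"(\d\d\d)-(\d\d\d-\d\d\d\d)")
mo = phone_regex.search("My number is 415-555-4242.")

area_code = mo.group(1)
everything = mo.groups()  # a tuple with one string per group
```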
The | character is called a pipe. You can use it anywhere you want to match one of many expressions. For example, the regular expression r'Batman|Tina Fey' will match either 'Batman' or 'Tina Fey'.
You can also use the pipe to match one of several patterns as part of your regex:
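A sketch of both uses:

```python
import re

# When both alternatives occur, the first match in the text is returned.
first = re.compile(r"Batman|Tina Fey").search("Batman and Tina Fey.").group()

# The pipe inside a group matches one of several suffixes.
bat = re.compile(r"Bat(man|mobile|copter|bat)").search("Batmobile lost a wheel")
```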
The ? character flags the group that precedes it as an optional part of the pattern.
The * (called the star or asterisk) means “match zero or more”—the group that precedes the star can occur any number of times in the text.
While * means “match zero or more,” the + (or plus) means “match one or more”. The group preceding a plus must appear at least once. It is not optional:
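A sketch comparing ?, *, and + on the same group:

```python
import re

optional = re.compile(r"Bat(wo)?man")  # zero or one "wo"
star = re.compile(r"Bat(wo)*man")      # zero or more "wo"
plus = re.compile(r"Bat(wo)+man")      # one or more "wo"

optional_many = optional.search("Batwowoman")   # None: two repeats are too many
star_many = star.search("Batwowoman").group()
plus_none = plus.search("Batman")               # None: at least one "wo" is required
plus_one = plus.search("Batwoman").group()
```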
If you have a group that you want to repeat a specific number of times, follow the group in your regex with a number in curly brackets. For example, the regex (Ha){3} will match the string 'HaHaHa', but it will not match 'HaHa', since the latter has only two repeats of the (Ha) group.
Instead of one number, you can specify a range by writing a minimum, a comma, and a maximum in between the curly brackets. For example, the regex (Ha){3,5} will match 'HaHaHa', 'HaHaHaHa', and 'HaHaHaHaHa'.
Python’s regular expressions are greedy by default, which means that in ambiguous situations they will match the longest string possible. The non-greedy version of the curly brackets, which matches the shortest string possible, has the closing curly bracket followed by a question mark.
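A sketch of the greedy and non-greedy forms of the same curly-bracket range:

```python
import re

text = "HaHaHaHaHa"
greedy = re.compile(r"(Ha){3,5}").search(text).group()      # longest possible match
nongreedy = re.compile(r"(Ha){3,5}?").search(text).group()  # shortest possible match
```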
In addition to the search() method, Regex objects also have a findall() method. While search() will return a Match object of the first matched text in the searched string, the findall() method will return the strings of every match in the searched string.
To summarize what the findall() method returns, remember the following:
When called on a regex with no groups, such as \d-\d\d\d-\d\d\d\d, the method findall() returns a list of string matches, such as ['415-555-9999', '212-555-0000'].
When called on a regex that has groups, such as (\d\d\d)-(\d\d\d)-(\d\d\d\d), the method findall() returns a list of tuples of strings (one string for each group), such as [('415', '555', '9999'), ('212', '555', '0000')].
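Both cases, sketched with an illustrative string:

```python
import re

text = "Cell: 415-555-9999 Work: 212-555-0000"

without_groups = re.compile(r"\d{3}-\d{3}-\d{4}").findall(text)
with_groups = re.compile(r"(\d{3})-(\d{3})-(\d{4})").findall(text)
```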
There are times when you want to match a set of characters but the shorthand character classes (\d, \w, \s, and so on) are too broad. You can define your own character class using square brackets. For example, the character class [aeiouAEIOU] will match any vowel, both lowercase and uppercase.
You can also include ranges of letters or numbers by using a hyphen. For example, the character class [a-zA-Z0-9] will match all lowercase letters, uppercase letters, and numbers.
By placing a caret character (^) just after the character class’s opening bracket, you can make a negative character class. A negative character class will match all the characters that are not in the character class. For example, enter the following into the interactive shell:
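A short sketch of a character class and its negated form. The sample string is an assumption made for illustration.

```python
import re

sample = 'Robocop eats'

# Positive class: every vowel, lowercase or uppercase.
vowels = re.compile(r'[aeiouAEIOU]').findall(sample)
print(vowels)      # ['o', 'o', 'o', 'e', 'a']

# Negative class: every character that is NOT a vowel (spaces included).
non_vowels = re.compile(r'[^aeiouAEIOU]').findall(sample)
print(non_vowels)  # ['R', 'b', 'c', 'p', ' ', 't', 's']
```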
You can also use the caret symbol (^) at the start of a regex to indicate that a match must occur at the beginning of the searched text.
Likewise, you can put a dollar sign ($) at the end of the regex to indicate the string must end with this regex pattern.
And you can use the ^ and $ together to indicate that the entire string must match the regex—that is, it’s not enough for a match to be made on some subset of the string.
The r'^Hello' regular expression string matches strings that begin with 'Hello':
The r'\d$' regular expression string matches strings that end with a numeric character from 0 to 9:
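The anchors described above can be exercised like this. The sample strings are assumptions; only the two patterns r'^Hello' and r'\d$' come from the text.

```python
import re

begins_with_hello = re.compile(r'^Hello')
starts = begins_with_hello.search('Hello world!')
does_not_start = begins_with_hello.search('He said hello.')
print(starts.group())  # Hello
print(does_not_start)  # None

ends_with_number = re.compile(r'\d$')
ends = ends_with_number.search('Your number is 42')
print(ends.group())    # 2

# ^ and $ together: the whole string must consist of digits.
whole_string_is_num = re.compile(r'^\d+$')
all_digits = whole_string_is_num.search('1234567890')
mixed = whole_string_is_num.search('12345xyz67890')
print(all_digits.group())  # 1234567890
print(mixed)               # None
```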
The . (or dot) character in a regular expression is called a wildcard and will match any character except for a newline:
The dot-star uses greedy mode: It will always try to match as much text as possible. To match any and all text in a nongreedy fashion, use the dot, star, and question mark (.*?). The question mark tells Python to match in a nongreedy way:
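A sketch of the wildcard dot and of greedy versus nongreedy dot-star. The sample sentences are assumptions made for illustration.

```python
import re

# The dot matches exactly one character, so 'flat' only contributes 'lat'.
at_words = re.compile(r'.at').findall('The cat in the hat sat on the flat mat.')
print(at_words)  # ['cat', 'hat', 'sat', 'lat', 'mat']

text = '<To serve man> for dinner.>'
greedy = re.compile(r'<.*>').search(text).group()
nongreedy = re.compile(r'<.*?>').search(text).group()
print(greedy)     # <To serve man> for dinner.>
print(nongreedy)  # <To serve man>
```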
The dot-star will match everything except a newline. By passing re.DOTALL as the second argument to re.compile(), you can make the dot character match all characters, including the newline character:
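The effect of re.DOTALL can be seen in a small sketch. The three-line sample text is an assumption.

```python
import re

text = 'Serve the public trust.\nProtect the innocent.\nUphold the law.'

# Without DOTALL the match stops at the first newline.
first_line = re.compile('.*').search(text).group()
print(first_line)  # Serve the public trust.

# With DOTALL the dot also matches newlines, so the whole text is matched.
everything = re.compile('.*', re.DOTALL).search(text).group()
print(everything == text)  # True
```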
To make your regex case-insensitive, you can pass re.IGNORECASE or re.I as a second argument to re.compile():
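A minimal sketch of case-insensitive matching. The sample strings are assumptions.

```python
import re

robocop = re.compile(r'robocop', re.I)  # re.I is shorthand for re.IGNORECASE

found = [
    robocop.search(s).group()
    for s in ('RoboCop is part man, part machine.',
              'ROBOCOP protects the innocent.',
              'Have you seen robocop?')
]
print(found)  # ['RoboCop', 'ROBOCOP', 'robocop']
```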
The sub() method for Regex objects is passed two arguments:
The first argument is a string to replace any matches.
The second is the string to search, that is, the text the regular expression is applied to.
The sub() method returns a string with the substitutions applied:
Another example:
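Two sketches of sub(), one with a plain replacement string and one that reuses a captured group through \1. The sentences and patterns are illustrative assumptions.

```python
import re

# Plain replacement: every match is swapped for the same string.
names_regex = re.compile(r'Agent \w+')
censored = names_regex.sub(
    'CENSORED', 'Agent Alice gave the secret documents to Agent Bob.')
print(censored)  # CENSORED gave the secret documents to CENSORED.

# \1 in the replacement inserts the text captured by group 1.
agent_regex = re.compile(r'Agent (\w)\w*')
initials = agent_regex.sub(
    r'\1****',
    'Agent Alice told Agent Carol that Agent Eve knew Agent Bob was a double agent.')
print(initials)  # A**** told C**** that E**** knew B**** was a double agent.
```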
To tell the re.compile() function to ignore whitespace and comments inside the regular expression string, “verbose mode” can be enabled by passing the variable re.VERBOSE as the second argument to re.compile().
Now instead of a hard-to-read regular expression like this:
you can spread the regular expression over multiple lines with comments like this:
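As a sketch of the contrast just described, the same phone-number pattern is written below in compact and in verbose form. The pattern itself is an assumption chosen for illustration; both versions should behave identically.

```python
import re

# Compact form: correct, but hard to read at a glance.
compact = re.compile(
    r'((\d{3}|\(\d{3}\))?(\s|-|\.)?\d{3}(\s|-|\.)\d{4}(\s*(ext|x|ext\.)\s*\d{2,5})?)')

# Verbose form: whitespace and # comments inside the pattern are ignored.
verbose = re.compile(r'''(
    (\d{3}|\(\d{3}\))?             # area code
    (\s|-|\.)?                     # separator
    \d{3}                          # first 3 digits
    (\s|-|\.)                      # separator
    \d{4}                          # last 4 digits
    (\s*(ext|x|ext\.)\s*\d{2,5})?  # extension
    )''', re.VERBOSE)

text = 'Call 415-555-1011 x123 now'
compact_match = compact.search(text).group()
verbose_match = verbose.search(text).group()
print(verbose_match)  # 415-555-1011 x123
```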
There are two main modules in Python that deal with path manipulation. One is the os.path module and the other is the pathlib module. The pathlib module was added in Python 3.4, offering an object-oriented way to handle file system paths.
On Windows, paths are written using backslashes (\) as the separator between folder names. On Unix-based operating systems such as macOS, Linux, and the BSDs, the forward slash (/) is used as the path separator. Joining paths can be a headache if your code needs to work on different platforms.
Fortunately, Python provides easy ways to handle this. We will showcase how to deal with this using both os.path.join and pathlib.Path.joinpath.
Using os.path.join on Windows:
And using pathlib on *nix:
pathlib also provides a shortcut to joinpath using the / operator:
Notice that the path separator differs between Windows and Unix-based operating systems. That is why you should use one of the above methods to join paths instead of concatenating strings.
Joining paths is helpful if you need to create different file paths under the same directory.
Using os.path.join on Windows:
Using pathlib on *nix:
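The joining approaches above can be sketched in a platform-independent way. As an assumption for reproducibility, this example uses ntpath and posixpath (the modules that os.path resolves to on Windows and on Unix-like systems respectively) and the Pure path classes, so the output is the same on any machine. The folder and file names are made up.

```python
import ntpath
import posixpath
from pathlib import PurePosixPath, PureWindowsPath

# os.path.join behaves like ntpath.join on Windows, posixpath.join elsewhere.
windows_joined = ntpath.join('usr', 'bin', 'spam')
posix_joined = posixpath.join('usr', 'bin', 'spam')
print(windows_joined)  # usr\bin\spam
print(posix_joined)    # usr/bin/spam

# pathlib: joinpath() and the / operator build the same path.
with_joinpath = PurePosixPath('usr').joinpath('bin', 'spam')
with_slash = PurePosixPath('usr') / 'bin' / 'spam'
windows_slash = PureWindowsPath('usr') / 'bin' / 'spam'

# Creating several file paths under the same directory.
home = PurePosixPath('/home/asweigart')
files = [str(home / name) for name in ('accounts.txt', 'details.csv')]
print(files)  # ['/home/asweigart/accounts.txt', '/home/asweigart/details.csv']
```

In everyday code you would normally use os.path.join and pathlib.Path directly and let Python pick the separator for the current platform.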
Using os on Windows:
Using pathlib on *nix:
Using os on Windows:
Using pathlib on *nix:
Oh no, we got a nasty error! The reason is that the 'delicious' directory does not exist, so we cannot make the 'walnut' and the 'waffles' directories under it. To fix this, do:
And all is good :)
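The failure and the fix described above can be reproduced safely inside a temporary directory. This is a sketch: the use of tempfile and the extra 'a/b/c' folder names are assumptions added for illustration.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / 'delicious' / 'walnut' / 'waffles'

    # Without parents=True, mkdir() fails because 'delicious' does not exist.
    try:
        target.mkdir()
        first_attempt_failed = False
    except FileNotFoundError:
        first_attempt_failed = True

    # parents=True creates every missing intermediate directory.
    target.mkdir(parents=True)
    created = target.is_dir()

    # The os equivalent creates intermediate directories by default.
    os.makedirs(os.path.join(tmp, 'a', 'b', 'c'))
    created_with_os = os.path.isdir(os.path.join(tmp, 'a', 'b', 'c'))

print(first_attempt_failed, created, created_with_os)  # True True True
```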
There are two ways to specify a file path.
An absolute path, which always begins with the root folder
A relative path, which is relative to the program’s current working directory
There are also the dot (.) and dot-dot (..) folders. These are not real folders but special names that can be used in a path. A single period (“dot”) for a folder name is shorthand for “this directory.” Two periods (“dot-dot”) means “the parent folder.”
To see if a path is an absolute path:
Using os.path on *nix:
Using pathlib on *nix:
You can extract an absolute path with both os.path and pathlib.
Using os.path on *nix:
Using pathlib on *nix:
You can get a relative path from a starting path to another path.
Using os.path on *nix:
Using pathlib on *nix:
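The three operations above (checking for an absolute path, extracting one, and computing a relative path) can be sketched as follows. As an assumption, the POSIX-flavoured posixpath and PurePosixPath are used where a fixed result is shown, so the example gives the same answers on any platform. The results of abspath() and resolve() depend on the current working directory and are therefore not shown.

```python
import os
import posixpath
from pathlib import Path, PurePosixPath

# Is the path absolute?
root_is_abs = posixpath.isabs('/')                      # True
parent_is_abs = posixpath.isabs('..')                   # False
pure_root_is_abs = PurePosixPath('/').is_absolute()     # True

# Extract an absolute path (result depends on the current directory).
absolute = os.path.abspath('..')
resolved = Path('..').resolve()

# Relative path from a starting path to another path.
relative = posixpath.relpath('/etc/passwd', '/')
relative_pure = PurePosixPath('/etc/passwd').relative_to('/')
print(relative)            # etc/passwd
print(str(relative_pure))  # etc/passwd
```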
Checking if a file/directory exists:
Using os.path on *nix:
Using pathlib on *nix:
Checking if a path is a file:
Using os.path on *nix:
Using pathlib on *nix:
Checking if a path is a directory:
Using os.path on *nix:
Using pathlib on *nix:
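The existence and type checks above can be tried against a throwaway directory. This is a sketch; the temporary directory and the file name are assumptions so that the example does not depend on what is on your disk.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    file_path = folder / 'setup.py'
    file_path.write_text('# placeholder\n')
    missing = folder / 'nonexistent'

    checks_os = (
        os.path.exists(file_path),   # True: the file exists
        os.path.exists(missing),     # False
        os.path.isfile(file_path),   # True
        os.path.isdir(file_path),    # False: it is a file, not a folder
        os.path.isdir(folder),       # True
    )
    checks_pathlib = (
        file_path.exists(),
        missing.exists(),
        file_path.is_file(),
        file_path.is_dir(),
        folder.is_dir(),
    )

print(checks_os)       # (True, False, True, False, True)
print(checks_pathlib)  # (True, False, True, False, True)
```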
Getting a file's size in bytes:
Using os.path on Windows:
Using pathlib on *nix:
Listing directory contents using os.listdir on Windows:
Listing directory contents using pathlib on *nix:
To find the total size of all the files in this directory:
WARNING: Directories themselves also have a size! So you might want to check whether a path is a file or a directory using the methods discussed in the above section!
Using os.path.getsize() and os.listdir() together on Windows:
Using pathlib on *nix:
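A sketch of the size and listing operations above, run against a temporary directory with files of known size. The file names and sizes are assumptions made for illustration. Following the warning above, only regular files are counted in the totals.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / 'a.txt').write_bytes(b'x' * 10)
    (folder / 'b.txt').write_bytes(b'y' * 25)
    (folder / 'sub').mkdir()

    # Size of a single file in bytes.
    size_os = os.path.getsize(folder / 'a.txt')
    size_pathlib = (folder / 'a.txt').stat().st_size

    # Directory contents.
    names_os = sorted(os.listdir(folder))
    names_pathlib = sorted(p.name for p in folder.iterdir())

    # Total size, skipping directories.
    total_os = sum(
        os.path.getsize(os.path.join(tmp, name))
        for name in os.listdir(tmp)
        if os.path.isfile(os.path.join(tmp, name))
    )
    total_pathlib = sum(p.stat().st_size for p in folder.iterdir() if p.is_file())

print(size_os, names_os, total_os)  # 10 ['a.txt', 'b.txt', 'sub'] 35
```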
The shutil module provides functions for copying files, as well as entire folders.
While shutil.copy() will copy a single file, shutil.copytree() will copy an entire folder and every folder and file contained in it:
The destination path can also specify a filename. In the following example, the source file is moved and renamed:
If there is no eggs folder, then move() will rename bacon.txt to a file named eggs.
Calling os.unlink(path) or Path.unlink() will delete the file at path.
Calling os.rmdir(path) or Path.rmdir() will delete the folder at path. This folder must be empty of any files or folders.
Calling shutil.rmtree(path) will remove the folder at path, and all files and folders it contains will also be deleted.
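The copy, move, and delete calls described above can be tried together in a sandbox. This is a sketch: the temporary directory and the folder names other than bacon.txt and eggs are assumptions, and everything is cleaned up automatically when the block exits.

```python
import os
import shutil
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / 'src').mkdir()
    bacon = root / 'src' / 'bacon.txt'
    bacon.write_text('crispy')

    # copy() duplicates one file; copytree() duplicates a whole folder.
    shutil.copy(bacon, root / 'bacon_copy.txt')
    shutil.copytree(root / 'src', root / 'src_backup')

    # No 'eggs' folder exists, so move() renames bacon.txt to a file named eggs.
    shutil.move(str(bacon), str(root / 'eggs'))
    eggs_is_file = (root / 'eggs').is_file()

    # unlink() removes a file, rmdir() an empty folder, rmtree() a whole tree.
    (root / 'bacon_copy.txt').unlink()
    os.rmdir(root / 'src')              # empty after the move
    shutil.rmtree(root / 'src_backup')  # still contained a copy of bacon.txt
    remaining = sorted(p.name for p in root.iterdir())

print(eggs_is_file)  # True
print(remaining)     # ['eggs']
```

Note that these deletions are permanent; they do not go through the operating system's trash or recycle bin.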
You can install this module by running pip install send2trash from a Terminal window.
pathlib provides a lot more functionality than the ones listed above, like getting file name, getting file extension, reading/writing a file without manually opening it, etc. Check out the official documentation if you want to know more!
To read/write to a file in Python, you will want to use the with statement, which will close the file for you after you are done.
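A minimal sketch of writing, appending, and reading with the with statement. The file name, its contents, and the temporary directory are assumptions made for illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'bacon.txt')

    # 'w' creates or overwrites; write() returns the number of characters written.
    with open(path, 'w') as bacon_file:
        written = bacon_file.write('Hello world!\n')

    # 'a' appends to the end of an existing file.
    with open(path, 'a') as bacon_file:
        bacon_file.write('Bacon is not a vegetable.\n')

    # The default mode is 'r' (read text).
    with open(path) as bacon_file:
        content = bacon_file.read()

    # Leaving the with block closed the file automatically.
    closed_after_block = bacon_file.closed

print(written)             # 13
print(closed_after_block)  # True
```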
To save variables:
To open and read variables:
Just like dictionaries, shelf values have keys() and values() methods that will return list-like values of the keys and values in the shelf. Since these methods return list-like values instead of true lists, you should pass them to the list() function to get them in list form.
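Saving and reading variables with shelve might look like the sketch below. The shelf name 'mydata', the stored list, and the temporary directory are assumptions made for illustration.

```python
import os
import shelve
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    shelf_path = os.path.join(tmp, 'mydata')

    # To save variables: assign to the shelf as you would to a dictionary.
    with shelve.open(shelf_path) as shelf:
        shelf['cats'] = ['Zophie', 'Pooka', 'Simon']

    # To open and read variables: reopen the shelf and index by key.
    with shelve.open(shelf_path) as shelf:
        cats = shelf['cats']
        keys = list(shelf.keys())      # list() turns the view into a true list
        values = list(shelf.values())

print(keys)    # ['cats']
print(values)  # [['Zophie', 'Pooka', 'Simon']]
```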
The extractall() method for ZipFile objects extracts all the files and folders from a ZIP file into the current working directory.
The extract() method for ZipFile objects will extract a single file from the ZIP file. Continue the interactive shell example:
This code will create a new ZIP file named new.zip that has the compressed contents of spam.txt.
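Creating new.zip from spam.txt and then extracting it again can be sketched as follows. The temporary directory and the explicit path= argument are assumptions added so the example does not write into your current working directory, which is where extractall() and extract() write by default.

```python
import os
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    spam_path = os.path.join(tmp, 'spam.txt')
    with open(spam_path, 'w') as spam_file:
        spam_file.write('Hello world! ' * 100)

    # Create new.zip containing the compressed contents of spam.txt.
    zip_path = os.path.join(tmp, 'new.zip')
    with zipfile.ZipFile(zip_path, 'w') as new_zip:
        new_zip.write(spam_path, arcname='spam.txt',
                      compress_type=zipfile.ZIP_DEFLATED)

    out_dir = os.path.join(tmp, 'extracted')
    with zipfile.ZipFile(zip_path) as example_zip:
        names = example_zip.namelist()
        example_zip.extract('spam.txt', path=out_dir)  # a single member
        example_zip.extractall(path=out_dir)           # every member
    extracted = os.listdir(out_dir)

print(names)      # ['spam.txt']
print(extracted)  # ['spam.txt']
```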
Open a JSON file with:
Write a JSON file with:
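A round trip through a JSON file, as a minimal sketch. The file name, the dictionary contents, and the temporary directory are assumptions made for illustration.

```python
import json
import os
import tempfile

content = {'name': 'Joe', 'age': 20}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'filename.json')

    # Write a JSON file.
    with open(path, 'w') as json_file:
        json.dump(content, json_file, indent=2)

    # Open (read) a JSON file.
    with open(path) as json_file:
        loaded = json.load(json_file)

print(loaded)  # {'name': 'Joe', 'age': 20}
```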
Compared to JSON, YAML is much easier for humans to maintain and lets you add comments. It is a convenient choice for configuration files that humans will have to edit.
There are two main libraries that provide access to YAML files:
Install them using pip install in your virtual environment.
The first one is easier to use, but the second one, Ruamel, implements the YAML specification much more completely and allows you, for example, to modify YAML content without altering its comments.
Open a YAML file with:
Anyconfig is a very handy package that completely abstracts the underlying configuration file format. It lets you load a Python dictionary from JSON, YAML, TOML, and so on.
Install it with:
Usage:
Exceptions are raised with a raise statement. In code, a raise statement consists of the following:
The raise keyword
A call to the Exception() function
A string with a helpful error message passed to the Exception() function
Often it’s the code that calls the function, not the function itself, that knows how to handle an exception. So you will commonly see a raise statement inside a function and the try and except statements in the code calling the function.
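The pattern described above (raise inside a function, try and except in the caller) can be sketched like this. The box_print function and its arguments are hypothetical, chosen only to give the raise statements something to guard.

```python
def box_print(symbol, width, height):
    """Print a hollow box; raise an exception on unusable arguments."""
    if len(symbol) != 1:
        raise Exception('Symbol must be a single character string.')
    if width <= 2:
        raise Exception('Width must be greater than 2.')
    if height <= 2:
        raise Exception('Height must be greater than 2.')
    print(symbol * width)
    for _ in range(height - 2):
        print(symbol + ' ' * (width - 2) + symbol)
    print(symbol * width)


# The calling code decides how to handle each failure.
errors = []
for sym, w, h in (('*', 4, 4), ('ZZ', 3, 3), ('x', 1, 3)):
    try:
        box_print(sym, w, h)
    except Exception as err:
        errors.append(str(err))

print(errors)
```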
The traceback is displayed by Python whenever a raised exception goes unhandled. But you can also obtain it as a string by calling traceback.format_exc(). This function is useful if you want the information from an exception’s traceback but also want an except statement to gracefully handle the exception. You will need to import Python’s traceback module before calling this function.
The 116 is the return value from the write() method, since 116 characters were written to the file. The traceback text was written to errorInfo.txt.
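A sketch of capturing a traceback as a string and writing it to errorInfo.txt. The temporary directory and the error message are assumptions. The number of characters written depends on where the code runs (file paths appear in the traceback), so it will generally differ from the 116 mentioned above.

```python
import os
import tempfile
import traceback

with tempfile.TemporaryDirectory() as tmp:
    error_path = os.path.join(tmp, 'errorInfo.txt')
    try:
        raise Exception('This is the error message.')
    except Exception:
        # format_exc() returns the traceback text instead of printing it.
        trace_text = traceback.format_exc()
        with open(error_path, 'w') as error_file:
            chars_written = error_file.write(trace_text)

print('The traceback info was written to errorInfo.txt.')
print(chars_written == len(trace_text))  # True
```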
An assertion is a sanity check to make sure your code isn’t doing something obviously wrong. These sanity checks are performed by assert statements. If the sanity check fails, then an AssertionError exception is raised. In code, an assert statement consists of the following:
The assert keyword
A condition (that is, an expression that evaluates to True or False)
A comma
A string to display when the condition is False
In plain English, an assert statement says, “I assert that this condition holds true, and if not, there is a bug somewhere in the program.” Unlike exceptions, your code should not handle assert statements with try and except; if an assert fails, your program should crash. By failing fast like this, you shorten the time between the original cause of the bug and when you first notice the bug. This will reduce the amount of code you will have to check before finding the code that’s causing the bug.
Disabling Assertions
Assertions can be disabled by passing the -O option when running Python.
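A minimal sketch of an assert statement with its four parts. The variable name and messages are assumptions. The failing assert is wrapped in try and except here only so the example can display the message; as noted above, real code should let a failed assertion crash the program.

```python
pod_bay_door_status = 'open'
# Condition holds, so nothing happens.
assert pod_bay_door_status == 'open', 'The pod bay doors need to be "open".'

pod_bay_door_status = "I'm sorry, Dave. I'm afraid I can't do that."
try:
    assert pod_bay_door_status == 'open', 'The pod bay doors need to be "open".'
except AssertionError as err:
    failure_message = str(err)
else:
    # Reached only when assertions are disabled with python -O.
    failure_message = None

print(failure_message)
```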
To enable the logging module to display log messages on your screen as your program runs, copy the following to the top of your program (but under the #! python shebang line):
Say you wrote a function to calculate the factorial of a number. In mathematics, factorial 4 is 1 × 2 × 3 × 4, or 24. Factorial 7 is 1 × 2 × 3 × 4 × 5 × 6 × 7, or 5,040. Open a new file editor window and enter the following code. It has a bug in it, but you will also enter several log messages to help yourself figure out what is going wrong. Save the program as factorialLog.py.
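A sketch of what such a program could look like. The format string and message wording are assumptions. The loop contains a deliberate bug, as the paragraph above describes, and the debug messages make it visible: i starts at 0, so total is multiplied by 0 on the first pass.

```python
import logging

logging.basicConfig(level=logging.DEBUG,
                    format=' %(asctime)s - %(levelname)s - %(message)s')
logging.debug('Start of program')


def factorial(n):
    logging.debug('Start of factorial(%s)', n)
    total = 1
    for i in range(n + 1):  # Deliberate bug: should be range(1, n + 1).
        total *= i
        logging.debug('i is %s, total is %s', i, total)
    logging.debug('End of factorial(%s)', n)
    return total


print(factorial(5))  # prints 0 because of the bug, not 120
logging.debug('End of program')
```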
Logging levels provide a way to categorize your log messages by importance. There are five logging levels, described in Table 10-1 from least to most important. Messages can be logged at each level using a different logging function.
After you’ve debugged your program, you probably don’t want all these log messages cluttering the screen. The logging.disable() function disables these so that you don’t have to go into your program and remove all the logging calls by hand.
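A sketch of logging.disable() in action. Passing a level suppresses messages at that level and below, so logging.CRITICAL silences everything. The messages and the final re-enabling call are assumptions added for illustration.

```python
import logging

logging.basicConfig(level=logging.INFO,
                    format=' %(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger()

logging.critical('Critical error! Critical error!')  # displayed
enabled_before = logger.isEnabledFor(logging.CRITICAL)

logging.disable(logging.CRITICAL)                    # suppress CRITICAL and below
logging.critical('Critical error! Critical error!')  # suppressed
logging.error('Error! Error!')                       # suppressed
enabled_after = logger.isEnabledFor(logging.CRITICAL)

logging.disable(logging.NOTSET)                      # undo: logging works again

print(enabled_before, enabled_after)  # True False
```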
Instead of displaying the log messages to the screen, you can write them to a text file. The logging.basicConfig() function takes a filename keyword argument, like so:
This function:
Is equivalent to the lambda function:
You don't even need to bind it to a name like add first:
Like regular nested functions, lambdas also work as lexical closures:
Note: lambda can only evaluate an expression, like a single line of code.
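The points above can be sketched in one place: a regular function, the equivalent lambda, a lambda called without being named, and a lambda acting as a closure. The names add_lambda and make_adder are assumptions.

```python
def add(x, y):
    return x + y


# The equivalent lambda, bound to a name.
add_lambda = lambda x, y: x + y

# A lambda can be called immediately, without binding it to a name.
immediate = (lambda x, y: x + y)(5, 3)


# Lambdas are lexical closures: the returned lambda remembers n.
def make_adder(n):
    return lambda x: x + n


plus_3 = make_adder(3)
plus_5 = make_adder(5)

print(add(5, 3), add_lambda(5, 3), immediate)  # 8 8 8
print(plus_3(4), plus_5(4))                    # 7 9
```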
Many programming languages have a ternary operator, which defines a conditional expression. Its most common use is a terse conditional assignment: a single line that evaluates the first expression if the condition is true and the second expression otherwise.
Example:
Ternary operators can be chained:
The code above is equivalent to:
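A sketch of a simple conditional expression, a chained one, and the equivalent if/elif/else statement. The age thresholds and labels are assumptions made for illustration.

```python
age = 15

# Simple conditional expression.
simple = 'kid' if age < 18 else 'adult'

# Chained conditional expressions are evaluated left to right.
chained = 'kid' if age < 13 else 'teenager' if age < 18 else 'adult'

# The chained form is equivalent to this if/elif/else statement.
if age < 13:
    expanded = 'kid'
elif age < 18:
    expanded = 'teenager'
else:
    expanded = 'adult'

print(simple, chained, expanded)  # kid teenager teenager
```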
The names args and kwargs are arbitrary; the important things are the * and ** operators. They can mean:
In a function declaration, * means “pack all remaining positional arguments into a tuple named <name>”, while ** is the same for keyword arguments (except it uses a dictionary, not a tuple).
In a function call, * means “unpack tuple or list named <name> to positional arguments at this position”, while ** is the same for keyword arguments.
For example you can make a function that you can use to call any other function, no matter what parameters it has:
Inside forward, args is a tuple (of all positional arguments except the first one, because we specified it - the f), kwargs is a dict. Then we call f and unpack them so they become normal arguments to f.
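A sketch of such a forwarding function. The name forward and its parameter f come from the text; the helper functions greet and show are hypothetical and exist only to give forward something to call.

```python
def forward(f, *args, **kwargs):
    # args is a tuple and kwargs is a dict; unpack both when calling f.
    return f(*args, **kwargs)


def greet(name, punctuation='.'):
    return 'Hello, ' + name + punctuation


def show(*args, **kwargs):
    # Return what was packed so the packing can be inspected.
    return args, kwargs


greeting = forward(greet, 'Al', punctuation='!')
packed = forward(show, 1, 2, key='value')
largest = forward(max, *[3, 9, 4])  # * unpacks the list into positional args

print(greeting)  # Hello, Al!
print(packed)    # ((1, 2), {'key': 'value'})
print(largest)   # 9
```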
You use *args when you have an indefinite number of positional arguments.
Similarly, you use **kwargs when you have an indefinite number of keyword arguments.
Functions can accept a variable number of positional arguments by using *args in the def statement.
You can use the items from a sequence as the positional arguments for a function with the * operator.
Using the * operator with a generator may cause your program to run out of memory and crash.
Adding new positional parameters to functions that accept *args can introduce hard-to-find bugs.
Function arguments can be specified by position or by keyword.
Keywords make it clear what the purpose of each argument is when it would be confusing with only positional arguments.
Keyword arguments with default values make it easy to add new behaviors to a function, especially when the function has existing callers.
Optional keyword arguments should always be passed by keyword instead of by position.
While Python's context managers are widely used, few understand the purpose behind their use. These statements, commonly used with reading and writing files, assist the application in conserving system memory and improve resource management by ensuring specific resources are only in use for certain processes.
A context manager is an object that is notified when a context (a block of code) starts and ends. You commonly use one with the with statement. It takes care of the notifying.
For example, file objects are context managers. When a context ends, the file object is closed automatically:
Anything that ends execution of the block causes the context manager's exit method to be called. This includes exceptions, and can be useful when an error causes you to prematurely exit from an open file or connection. Exiting a script without properly closing files/connections is a bad idea, that may cause data loss or other problems. By using a context manager you can ensure that precautions are always taken to prevent damage or loss in this way.
It is also possible to write a context manager using generator syntax thanks to the contextlib.contextmanager decorator:
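A minimal sketch of a generator-based context manager. The function name, the events list, and the messages are assumptions. Code before the yield runs on entry, and the finally block runs on exit, including when the body raises an exception.

```python
from contextlib import contextmanager

events = []


@contextmanager
def context_manager(num):
    events.append('Enter')
    try:
        yield num + 1          # the value bound by "as" in the with statement
    finally:
        events.append('Exit')  # runs even if the block raises


with context_manager(2) as cm:
    events.append('Right in the middle with cm = {}'.format(cm))

# The exit code also runs when the block ends with an exception.
try:
    with context_manager(0):
        raise ValueError('something went wrong')
except ValueError:
    pass

print(events)
```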
__main__ Top-level script environment
__main__ is the name of the scope in which top-level code executes. A module’s __name__ is set equal to '__main__' when read from standard input, a script, or from an interactive prompt.
A module can discover whether or not it is running in the main scope by checking its own __name__, which allows a common idiom for conditionally executing code in a module when it is run as a script or with python -m but not when it is imported:
For a package, the same effect can be achieved by including a __main__.py module, the contents of which will be executed when the module is run with -m.
For example, if we are developing a script that is also designed to be used as a module, we should do:
Every Python module has its __name__ defined, and if this is '__main__', it implies that the module is being run standalone by the user and we can take the corresponding appropriate actions.
If you import this script as a module in another script, the name is set to the name of the script/module.
Python files can act as either reusable modules, or as standalone programs.
if __name__ == "__main__": is used to execute some code only if the file was run directly, and not imported.
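A minimal sketch of the idiom. The add function is a hypothetical stand-in for whatever the module provides.

```python
def add(a, b):
    """Reusable function: available to any module that imports this file."""
    return a + b


if __name__ == '__main__':
    # Runs only when the file is executed directly (or with python -m),
    # not when it is imported from another module.
    print(add(3, 5))
```

If the file is saved as, say, calculator.py (a hypothetical name), running python calculator.py prints 8, while import calculator prints nothing and simply makes calculator.add available.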
The setup script is the centre of all activity in building, distributing, and installing modules using the Distutils. The main purpose of the setup script is to describe your module distribution to the Distutils, so that the various commands that operate on your modules do the right thing.
The setup.py file is at the heart of a Python project. It describes all of the metadata about your project. There are quite a few fields you can add to a project to give it a rich set of metadata describing the project. However, there are only three required fields: name, version, and packages. The name field must be unique if you wish to publish your package on the Python Package Index (PyPI). The version field keeps track of different releases of the project. The packages field describes where you’ve put the Python source code within your project.
This allows you to easily install Python packages. Often it's enough to write:
and the module will install itself.
Our initial setup.py will also include information about the license and will re-use the README.txt file for the long_description field. This will look like:
For more information, visit http://docs.python.org/install/index.html.
Dataclasses are Python classes that are suited for storing data objects. The dataclasses module provides a decorator and functions for automatically adding generated special methods such as __init__() and __repr__() to user-defined classes.
They store data and represent a certain data type. Ex: A number. For people familiar with ORMs, a model instance is a data object. It represents a specific kind of entity. It holds attributes that define or represent the entity.
They can be compared to other objects of the same type. Ex: A number can be greater than, less than, or equal to another number.
Python 3.7 provides a decorator dataclass that is used to convert a class into a dataclass.
python 2.7
with dataclass
It is easy to add default values to the fields of your data class.
Type annotations are mandatory in a dataclass: a class attribute without one is not treated as a field. However, if you don't want to specify the data type, use typing.Any.
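The dataclass features above can be sketched together: a hand-written class for comparison, the same class as a dataclass, default values, and a field annotated with typing.Any. The class and field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any


class ManualNumber:
    """Hand-written class: __init__ must be written by hand."""

    def __init__(self, val):
        self.val = val


@dataclass
class Number:
    val: int  # __init__, __repr__ and __eq__ are generated automatically


@dataclass
class Product:
    name: str
    count: int = 0      # default values follow the annotation
    price: float = 0.0
    tag: Any = None     # typing.Any when no specific type is wanted


print(Number(2))                             # Number(val=2)
print(Number(2) == Number(2))                # True: compared field by field
print(ManualNumber(2) == ManualNumber(2))    # False: compared by identity
print(Product('Python'))
# Product(name='Python', count=0, price=0.0, tag=None)
```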
The use of a Virtual Environment is to test python code in encapsulated environments and to also avoid filling the base Python installation with libraries we might use for only one project.
Install virtualenv
Install virtualenvwrapper-win (Windows)
Usage:
Make a Virtual Environment
Anything we install now will be specific to this project and available to the projects we connect to this environment.
Set Project Directory
To bind our virtualenv with our current working directory we simply enter:
Deactivate
To move onto something else in the command line type ‘deactivate’ to deactivate your environment.
Notice how the parentheses disappear.
Workon
Open up the command prompt and type ‘workon HelloWorld’ to activate the environment and move into your root project folder.
Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you.
Install Poetry
Create a new project
This will create a my-project directory:
The pyproject.toml file will orchestrate your project and its dependencies:
Packages
To add dependencies to your project, you can specify them in the tool.poetry.dependencies section:
Also, instead of modifying the pyproject.toml file by hand, you can use the add command and it will automatically find a suitable version constraint.
To install the dependencies listed in the pyproject.toml:
To remove dependencies:
For more information, check the documentation.
Pipenv is a tool that aims to bring the best of all packaging worlds (bundler, composer, npm, cargo, yarn, etc.) to the Python world. Windows is a first-class citizen, in our world.
Install pipenv
Enter your Project directory and install the Packages for your project
Pipenv will install your package and create a Pipfile for you in your project’s directory. The Pipfile is used to track which dependencies your project needs in case you need to re-install them.
Uninstall Packages
Activate the Virtual Environment associated with your Python project
Exit the Virtual Environment
Find more information and a video in docs.pipenv.org.
Anaconda is another popular tool to manage python packages.
Where packages, notebooks, projects and environments are shared. Your place for free public conda package hosting.
Usage:
Make a Virtual Environment
To use the Virtual Environment, activate it by:
Anything installed now will be specific to the project HelloWorld
Exit the Virtual Environment
| Operators | Operation | Example |
| --- | --- | --- |
| ** | Exponent | 2 ** 3 = 8 |
| % | Modulus/Remainder | 22 % 8 = 6 |
| // | Integer division | 22 // 8 = 2 |
| / | Division | 22 / 8 = 2.75 |
| * | Multiplication | 3 * 3 = 9 |
| - | Subtraction | 5 - 2 = 3 |
| + | Addition | 2 + 2 = 4 |

| Data Type | Examples |
| --- | --- |
| Integers | -2, -1, 0, 1, 2, 3, 4, 5 |
| Floating-point numbers | -1.25, -1.0, -0.5, 0.0, 0.5, 1.0, 1.25 |
| Strings | 'a', 'aa', 'aaa', 'Hello!', '11 cats' |

| Operator | Meaning |
| --- | --- |
| == | Equal to |
| != | Not equal to |
| < | Less than |
| > | Greater than |
| <= | Less than or equal to |
| >= | Greater than or equal to |

| Expression | Evaluates to |
| --- | --- |
| True and True | True |
| True and False | False |
| False and True | False |
| False and False | False |

| Expression | Evaluates to |
| --- | --- |
| True or True | True |
| True or False | True |
| False or True | True |
| False or False | False |

| Expression | Evaluates to |
| --- | --- |
| not True | False |
| not False | True |

| Operator | Equivalent |
| --- | --- |
| spam += 1 | spam = spam + 1 |
| spam -= 1 | spam = spam - 1 |
| spam *= 1 | spam = spam * 1 |
| spam /= 1 | spam = spam / 1 |
| spam %= 1 | spam = spam % 1 |

| Escape character | Prints as |
| --- | --- |
| \' | Single quote |
| \" | Double quote |
| \t | Tab |
| \n | Newline (line break) |
| \\ | Backslash |

| Symbol | Matches |
| --- | --- |
| ? | zero or one of the preceding group. |
| * | zero or more of the preceding group. |
| + | one or more of the preceding group. |
| {n} | exactly n of the preceding group. |
| {n,} | n or more of the preceding group. |
| {,m} | 0 to m of the preceding group. |
| {n,m} | at least n and at most m of the preceding group. |
| {n,m}? or *? or +? | performs a nongreedy match of the preceding group. |
| ^spam | means the string must begin with spam. |
| spam$ | means the string must end with spam. |
| . | any character, except newline characters. |
| \d, \w, and \s | a digit, word, or space character, respectively. |
| \D, \W, and \S | anything except a digit, word, or space character, respectively. |
| [abc] | any character between the brackets (such as a, b, or c). |
| [^abc] | any character that isn’t between the brackets. |

| Level | Logging Function | Description |
| --- | --- | --- |
| DEBUG | logging.debug() | The lowest level. Used for small details. Usually you care about these messages only when diagnosing problems. |
| INFO | logging.info() | Used to record information on general events in your program or confirm that things are working at their point in the program. |
| WARNING | logging.warning() | Used to indicate a potential problem that doesn’t prevent the program from working but might do so in the future. |
| ERROR | logging.error() | Used to record an error that caused the program to fail to do something. |
| CRITICAL | logging.critical() | The highest level. Used to indicate a fatal error that has caused or is about to cause the program to stop running entirely. |