♦ πŸ† 3 min, 🐌 6 min

🐍 Python: Standard library

Python, like most languages, ships with an extensive set of modules and packages you can use. I'm always surprised by how much is built into the standard library.

Why is it so important to know what's available to you? Well, for starters, if you know what's at your disposal, you can reuse a lot of tools others wrote. Plus, if you stick with the implementations from the standard library, your code will be more robust and have fewer external dependencies, which is always good practice in software development.

Here's the thing: the Python standard library is one of the most stable parts of the ecosystem. Core language developers are really, really careful with the changes they make. Changes are announced in advance, tested, ... which can't be said for all other libraries out there.

Now, to be clear: use a library that's not part of the standard library if it fits your need, but think twice, because every piece of crap you bring into your codebase will make your life harder in the future.

The official documentation has a list of all Python 3 modules. In the next few paragraphs, we'll cover a few that are not too specialised.

How to use a module?

Take the collections module as an example.

It is full of container data types such as OrderedDict and deque (a list-like container with fast appends and pops on either end).

To import part of a module:

from collections import OrderedDict

d = OrderedDict()

Or import the entire module:

import collections

d = collections.OrderedDict()
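
To make the two containers a bit more concrete, here is a minimal sketch. The variable names and values are made up for illustration.

```python
from collections import OrderedDict, deque

# deque: cheap appends and pops on both ends
recent = deque([1, 2, 3])
recent.appendleft(0)      # add on the left
recent.append(4)          # add on the right
first = recent.popleft()  # remove from the left
last = recent.pop()       # remove from the right

# OrderedDict: remembers insertion order and lets you reorder keys
scores = OrderedDict(alice=1, bob=2)
scores.move_to_end("alice")

print(first, last, list(recent))
print(list(scores))
```

A plain list can do the same job, but popping from its left end gets slower as the list grows, which is where deque earns its place.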

IO

One area with great support in Python is IO: writing to disk and reading from it.
  • json: Module for reading and writing JSON data. More about json in the upcoming post.
  • csv: Write and read tabular data to and from delimited files.
  • gzip: Interfaces for gzip compression and decompression using file objects.
  • hashlib: Secure hash and message digest algorithms.
  • pathlib: Object-oriented filesystem paths. Super handy for checking whether a file path exists, ...
  • sqlite3: A DB-API implementation using SQLite 3.x.
  • aifc: Read and write audio files in AIFF or AIFC format (deprecated since Python 3.11 and removed in 3.13).
  • tarfile: Read and write tar-format archive files.
  • io: Core tools for working with streams of data (binary file readers ...).
  • zipfile: Read and write ZIP-format archive files.
  • xml: Package containing XML processing modules.
  • copy: Shallow and deep copy operations. To perform deep copies of objects.
  • types: Names of built-in types. Use them to "fix" data structures.
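
As a rough illustration of how a few of these fit together, the sketch below writes a JSON file with pathlib, reads it back, and hashes its bytes with hashlib. The file name and the data are hypothetical, and everything happens in a temporary directory so nothing is left behind.

```python
import hashlib
import json
import tempfile
from pathlib import Path

settings = {"name": "demo", "retries": 3}  # made-up example data

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"  # hypothetical file name

    # Serialize to JSON and write it to disk
    path.write_text(json.dumps(settings), encoding="utf-8")

    existed = path.exists()
    loaded = json.loads(path.read_text(encoding="utf-8"))

    # SHA-256 digest of the raw file contents
    digest = hashlib.sha256(path.read_bytes()).hexdigest()

print(existed, loaded == settings, len(digest))
```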

There are also some caching options:

from functools import lru_cache

lru_cache is a Least Recently Used cache. Items don't expire after a set time; once the cache is full, the entry that was used least recently is dropped to make room.
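
A small sketch of what that looks like in practice, using a naive Fibonacci function as a stand-in for something expensive. The function and the maxsize value are just for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def fib(n):
    """Naive recursive Fibonacci; repeated calls are served from the cache."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


result = fib(30)
info = fib.cache_info()  # hits, misses, maxsize, currsize

print(result)
print(info)
```

Without the decorator the same call makes over a million recursive calls; with it, each value of n is computed only once.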

There's also shelve: a Python object persistence module that lets you store objects in a key-value manner in a file on disk, so they survive between runs of your program. For example, to create a shelf do:

import shelve

with shelve.open('spam') as db:
    db['eggs'] = 'eggs'

Then you can read and write objects in db anywhere inside the with block, much like a dictionary. This way, if a resource has already been computed or fetched once, you can load it from the shelf, which is often much faster than producing it again.
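
Here is a slightly fuller sketch that stores a value, closes the shelf, and opens it again to show the data is still there. It uses a temporary directory so it cleans up after itself; the key and the stored value are made up.

```python
import shelve
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    shelf_path = str(Path(tmp) / "spam")

    # First session: store a regular Python object
    with shelve.open(shelf_path) as db:
        db["eggs"] = {"count": 12}

    # Second session: the object is read back from disk
    with shelve.open(shelf_path) as db:
        restored = db["eggs"]
        keys = list(db.keys())

print(restored, keys)
```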

Parallelization, background processes

  • threading: Thread-based parallelism.
  • multiprocessing: Process-based parallelism.
  • queue: The queue module implements multi-producer, multi-consumer queues. It is especially useful in threaded programming when information must be exchanged safely between multiple threads.
  • asyncio: Asynchronous I/O. Lets the rest of your code keep running while a task waits for I/O to finish.
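
To show how threading and queue work together, here is a minimal sketch of a worker pool. The number of workers and the "square a number" task are placeholders for real work.

```python
import queue
import threading

tasks = queue.Queue()
results = queue.Queue()


def worker():
    """Take numbers off the task queue until a None sentinel arrives."""
    while True:
        n = tasks.get()
        if n is None:
            break
        results.put(n * n)


threads = [threading.Thread(target=worker) for _ in range(3)]
for t in threads:
    t.start()

for n in range(5):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one sentinel per worker so each one stops
for t in threads:
    t.join()

# Results arrive in no fixed order, so sort them before printing
squares = sorted(results.get() for _ in range(5))
print(squares)
```

The queues handle the locking, so the threads never touch shared state directly.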

GUI

There's tkinter, an interface to Tcl/Tk for building really simple graphical user interfaces. I'm not a big fan of it.

Reliability

There are a few built-in modules in the Python standard library that can help you make your codebase more reliable, faster, more readable and more modular:

  • warnings: Issue warning messages.
  • logging: Flexible event logging system for applications. Especially to store error logs.
  • cProfile: For profiling Python programs, to measure how long certain portions of the code take to execute. Don't mindlessly optimise. Write the code, identify the slow stuff and optimise that.
  • pydoc: Documentation generator and online help system.
  • venv: Creation of virtual environments. Use this to ensure that others can run your code by creating a virtual environment, fast. No extra dependencies needed.
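
As a quick sketch of warnings and logging side by side: the function below warns that it is deprecated, and a logger writes to an in-memory stream so the output can be inspected. The logger name and the old_api function are hypothetical.

```python
import io
import logging
import warnings

# Send log records to an in-memory stream instead of the console
stream = io.StringIO()
logger = logging.getLogger("demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
logger.addHandler(handler)
logger.propagate = False


def old_api():
    """Hypothetical function that tells callers it is on its way out."""
    warnings.warn("old_api is deprecated", DeprecationWarning, stacklevel=2)
    return 42


# Capture warnings so they can be checked rather than printed
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    value = old_api()

logger.info("old_api returned %s", value)
logger.error("something went wrong")
log_text = stream.getvalue()

print(caught[0].category.__name__)
print(log_text)
```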

Time

  • datetime: Basic date and time types.
  • zoneinfo: IANA time zone support.
  • time: Time access and conversions.
  • calendar: Calendar-related operations.
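
A short sketch of datetime and calendar doing date arithmetic around a leap day. The dates are arbitrary, and UTC is used to keep the example independent of the machine's time zone.

```python
import calendar
from datetime import date, datetime, timedelta, timezone

start = datetime(2024, 2, 28, 12, 0, tzinfo=timezone.utc)
later = start + timedelta(days=2)  # steps over 29 February

days_in_feb = calendar.monthrange(2024, 2)[1]  # number of days in the month
is_leap = calendar.isleap(2024)
weekday_index = date(2024, 2, 29).weekday()    # Monday is 0

print(later.isoformat(), days_in_feb, is_leap, weekday_index)
```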

Networking

Networking tools are really well supported in Python:

  • mailbox: Manipulate mailboxes in various formats.
  • webbrowser: Easy-to-use controller for Web browsers.
  • smtpd: An SMTP server implementation in Python (deprecated, and removed in Python 3.12).
  • smtplib: SMTP protocol client.
  • socketserver: A framework for network servers.
  • ssl: TLS/SSL wrapper for socket objects.
  • email: Package supporting the parsing, manipulating, and generating email messages.
  • socket: Low-level networking interface.
  • ftplib: FTP protocol client.
  • html: Helpers for manipulating HTML.
  • http: HTTP status codes and messages.
  • urllib: URL handling module.
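
Several of these are useful even without touching the network. The sketch below builds and parses a URL with urllib.parse, escapes a string with html, and looks up a status code with http. The URL is a placeholder on example.com.

```python
from html import escape
from http import HTTPStatus
from urllib.parse import parse_qs, urlencode, urlparse

# Build a query string, then take the URL apart again
url = "https://example.com/search?" + urlencode({"q": "standard library", "page": 2})
parts = urlparse(url)
query = parse_qs(parts.query)

# Make text safe to embed in HTML
safe = escape("<b>Tom & Jerry</b>")

# Status codes come with their standard phrases
status = HTTPStatus.NOT_FOUND

print(url)
print(parts.netloc, parts.path, query)
print(safe)
print(status.value, status.phrase)
```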

Math and numerics

There are quite a few things you can do mathematics-wise with pure Python before you bring in the crap like scipy and numpy. To some extent, those libraries are great, but they hide too many implementation details, which is fucking dangerous. So use the basics from Python whenever you can.

  • statistics: Mathematical statistics functions.
  • math: Math functions exp, sin, ...
  • cmath: Mathematical functions for complex numbers.
  • random: Generate pseudo-random numbers with various common distributions.
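
For a sense of how far the basics go, here is a small sketch with made-up sample data. The seed passed to random.Random is arbitrary and only there to make runs repeatable.

```python
import math
import random
import statistics

samples = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

mean = statistics.mean(samples)
median = statistics.median(samples)
spread = statistics.pstdev(samples)  # population standard deviation

hypotenuse = math.hypot(3, 4)

rng = random.Random(42)  # seeded generator, repeatable across runs
draw = rng.random()      # float in the range [0.0, 1.0)

print(mean, median, spread, hypotenuse, draw)
```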

Command line/OS interface

Having a nice interface with the underlying OS is always nice. Bash is painful to write, so if control from Python is on the table, sign me up.

  • subprocess: Subprocess management.
  • os: Miscellaneous operating system interfaces, including os.system(), which lets us run shell commands.
  • sys: Access system-specific parameters and functions.
  • sysconfig: Python's configuration information.
  • syslog: An interface to the Unix syslog library routines.
  • pipes: Interface to shell pipelines (deprecated since Python 3.11 and removed in 3.13).
  • argparse: Command-line option and argument parsing library.
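
A minimal sketch of argparse and subprocess together. The arguments are passed as an explicit list so the example runs anywhere; a real script would let parse_args() read sys.argv. The option names are made up.

```python
import argparse
import subprocess
import sys

parser = argparse.ArgumentParser(description="Greet someone.")
parser.add_argument("name")
parser.add_argument("--shout", action="store_true")

# Explicit argument list for the demo instead of the real command line
args = parser.parse_args(["world", "--shout"])

greeting = f"hello {args.name}"
if args.shout:
    greeting = greeting.upper()

# Run a child Python process and capture what it prints
completed = subprocess.run(
    [sys.executable, "-c", "print('hi from a child process')"],
    capture_output=True,
    text=True,
    check=True,  # raise if the child exits with an error
)
child_output = completed.stdout.strip()

print(greeting)
print(child_output)
```

Passing the command as a list, rather than a single shell string, avoids quoting problems and is generally the safer habit.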

Other cool modules:

  • difflib: Helpers for computing differences between objects. Can come in handy if you want to track changes.
  • importlib: The implementation of the import machinery. This way you can import Python modules, packages, folders, ... programmatically instead of typing boilerplate code.
  • pprint: Data pretty printer.
  • re: Regular expression operations.
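
To round this off, a short sketch touching each of the four. The lists, the text and the nested dictionary are made up for illustration.

```python
import difflib
import importlib
import re
from pprint import pformat

# difflib: keep only the added and removed lines of a unified diff
old = ["alpha", "beta", "gamma"]
new = ["alpha", "beta2", "gamma"]
changes = [
    line
    for line in difflib.unified_diff(old, new, lineterm="")
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]

# importlib: import a module by its name given as a string
math_module = importlib.import_module("math")
root = math_module.sqrt(16)

# re: pull ISO-style dates out of free text
dates = re.findall(r"\d{4}-\d{2}-\d{2}", "released 2024-02-29, patched 2024-03-01")

# pprint: format nested data so it is easier to read
report = pformat({"changes": changes, "dates": dates, "root": root})

print(report)
```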

That's it. In the upcoming post, we'll explore some of the features in depth.
