A Look at the Requests Python Package
My post last week asked whether it would be advisable to create a “canon of code” that programmers should be familiar with in order to write good programs. It might be helpful to restate the motivation for this. In college I studied music performance, primarily jazz. One of the reasons I stopped studying music is that it became abundantly clear to me that the best musicians were those who knew musical traditions inside and out, and I wasn’t one of those people. For example, if you mentioned an artist who was moderately famous but had been dead for forty years, a good musician would be able to tell you what made the artist’s music unique and influential. Likewise, the best writers are thoroughly steeped in the literary tradition. Read Murakami’s or Hitchens’s commentaries and you will find references to past literature throughout.
To dispel any misconceptions, I do not think that programming is an artistic endeavour on par with music or writing. Music and writing are primarily aesthetic endeavours. Programs, on the other hand, have a functional purpose. That being said, some modes of writing, such as persuasive essays, also have a function and can be critiqued by how well they persuade. Programming, writing, and making music are, in any case, more enjoyable when done with style. For this reason, it’s important for programmers to understand what constitutes good style in their field. The best way to do that is to study successful programs and libraries, and the first library we’ll take a look at is Requests by Kenneth Reitz, Ian Cordasco, and Cory Benfield.
Requests is a popular Python package for high-level network communication over HTTP. Its popularity is attested to by its 400,000 daily downloads, which has led some to ask whether it should be included in the Python standard library. Its chief maintainer (he goes by BDFL, a dumb term) has said that will never happen because inclusion in “the standard library is where a library goes to die.” Once a library has been included in the standard library it cannot introduce breaking changes, and its development cadence becomes tied to the Python release cycle. For these reasons it’s completely understandable that the authors would want to stay on the third-party package track.
You might be thinking that Requests is popular and widely used because it provides functionality that was not available in the standard library or other packages, but you’d be wrong. urllib.request in the standard library “defines functions and classes which help in opening URLs (mostly HTTP) in a complex world–basic and digest authentication, redirections, cookies and more.” Yet near the top of the documentation for urllib.request one finds a call-out: “The Requests package is recommended for a higher-level HTTP interface.” It seems that the Python maintainers have figured out that, for most people working with HTTP, the interface provided by urllib.request is more granular and complex than they need. Strangely, urllib.request is not the most granular of the Python standard library modules dealing with HTTP. There’s also the http module, which urllib builds on top of. There are other third-party packages, such as httplib2, that provide the higher-level interface users have come to expect from libraries of this sort, but in the past few years Requests has become the de facto standard library for HTTP communication.
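To make the granularity point concrete, here is a rough sketch of what a simple GET with query parameters and a JSON response involves when only the standard library is used. The helper names build_get and fetch_json are made up for this illustration, and example.invalid is a placeholder host; the request is built but not sent.

```python
import json
import urllib.parse
import urllib.request


def build_get(url, params=None, headers=None):
    """Build (but do not send) a GET request using only the standard library."""
    if params:
        # Query parameters have to be encoded and appended by hand.
        url = url + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers=headers or {}, method="GET")


def fetch_json(request, timeout=10.0):
    """Send the request and decode a JSON body (this step needs network access)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # The body arrives as bytes, so the charset has to be resolved manually.
        charset = response.headers.get_content_charset() or "utf-8"
        return json.loads(response.read().decode(charset))


# Placeholder URL and parameters, chosen only for the example.
req = build_get(
    "https://example.invalid/search",
    params={"q": "requests", "page": 2},
    headers={"Accept": "application/json"},
)
print(req.get_method(), req.full_url)
```

With Requests, the same intent is usually written as a single expression along the lines of requests.get(url, params=..., headers=...).json(), which is the kind of saving the urllib.request documentation is pointing at.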
Surprisingly, Requests is a small library. The latest version (2.18.4) clocks in at around 5,000 lines of code, plus an additional 4,000 for tests. That is very little code for what many people might consider to be a foundational Python library. At a high level, Requests can be broken into the following components:
- adapters.py - This module declares the HTTP transport adapters in Requests. Adapters do the actual sending and receiving of data.
- api.py - The suggested public API of Requests. I say suggested because Requests exposes other classes that you might want to use if you have unique needs, such as Session, Request, and Response, but a major part of Requests’s appeal and adoption is its tiny API. This module declares eight functions: request, get, options, head, post, put, patch, and delete. request is a bit superfluous, as it’s possible to get the functionality of request by invoking one of the seven other functions. As you can see, each of the other seven functions corresponds to an HTTP verb.
- auth.py - This module is used internally within Requests to provide basic and digest authentication.
- cookies.py - Also used internally, this module provides cookie handling.
- models.py - This module provides classes such as Request, PreparedRequest, and Response; in other words, the primary classes that users of Requests are exposed to.
- sessions.py - Ties together models, cookies, authentication, and other trans-request settings so that settings do not have to be configured for each request.
- structures.py - Provides a case-insensitive dictionary implementation, which is used to store HTTP headers, whose names are case-insensitive.
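The case-insensitive dictionary mentioned under structures.py is a small idea that is easy to sketch. The class below is a hypothetical, minimal version written for this post, not the implementation Requests ships: it looks keys up by their lowercased form while remembering the casing that was last used to set them.

```python
from collections.abc import MutableMapping


class CaseInsensitiveHeaders(MutableMapping):
    """Hypothetical sketch: case-insensitive lookup that preserves original casing."""

    def __init__(self, data=None):
        # Maps the lowercased key to a (key as last set, value) pair.
        self._store = {}
        self.update(data or {})

    def __setitem__(self, key, value):
        self._store[key.lower()] = (key, value)

    def __getitem__(self, key):
        return self._store[key.lower()][1]

    def __delitem__(self, key):
        del self._store[key.lower()]

    def __iter__(self):
        # Iterate over keys in the casing the caller supplied.
        return (original for original, _ in self._store.values())

    def __len__(self):
        return len(self._store)


headers = CaseInsensitiveHeaders({"Content-Type": "application/json"})
print(headers["content-type"], list(headers))
```

Deriving from MutableMapping means methods such as get, update, and the in operator come for free once the five methods above are defined.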
There are a few other modules, notably a utils module, that provide additional functionality, but they are either used primarily by the library itself or do not offer interesting functionality. Of the latter, one such module is exceptions, which provides enumerated HTTP exceptions.
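The shape of api.py described in the list above, one general function plus a thin wrapper per HTTP verb, can be sketched with the standard library alone. This is an illustration of the pattern rather than Requests’s own code: the functions below only build urllib.request.Request objects and never send them, and example.invalid is a placeholder host.

```python
import urllib.request


def request(method, url, **kwargs):
    """General entry point: build (but do not send) a request for any HTTP method."""
    return urllib.request.Request(url, method=method.upper(), **kwargs)


# One thin wrapper per HTTP verb; each one only fixes the `method` argument.
def get(url, **kwargs):
    return request("GET", url, **kwargs)


def options(url, **kwargs):
    return request("OPTIONS", url, **kwargs)


def head(url, **kwargs):
    return request("HEAD", url, **kwargs)


def post(url, data=None, **kwargs):
    return request("POST", url, data=data, **kwargs)


def put(url, data=None, **kwargs):
    return request("PUT", url, data=data, **kwargs)


def patch(url, data=None, **kwargs):
    return request("PATCH", url, data=data, **kwargs)


def delete(url, **kwargs):
    return request("DELETE", url, **kwargs)


wrappers = (get, options, head, post, put, patch, delete)
methods = [f("https://example.invalid/items").get_method() for f in wrappers]
print(methods)
```

Seen this way, the seven verb functions are conveniences over the general one, which is why so small a surface can still cover most everyday needs.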
Let’s next take a look at the dependency graph of requests, excluding Python standard library imports, starting from the Requests module itself. A module’s dependencies are enumerated only the first time the module appears:
__init__.py:
- urllib3 (external package)
- chardet (external package)
- utils
  - _internal_utils
    - compat
  - compat
    - chardet
  - cookies
    - _internal_utils
    - compat
  - structures
    - compat
  - exceptions
    - urllib3
  - certs
    - certifi (external package)
- packages
  - urllib3
  - chardet
- models
  - urllib3
  - hooks
  - structures
  - auth
    - compat
    - cookies
    - _internal_utils
    - utils
  - cookies
  - exceptions
    - urllib3
  - _internal_utils
  - utils
  - compat
  - status_codes
    - structures
- api
  - sessions
    - auth
    - compat
    - cookies
    - models
    - hooks
    - _internal_utils
    - utils
    - exceptions
    - structures
    - adapters
    - status_codes
- sessions
- status_codes
- exceptions
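A graph like the one above can be derived mechanically. The sketch below is one possible approach, assuming it is enough to look at relative imports (from .x import y and from . import x) in each module’s source. The function names are invented for this post, and the toy sources are stand-ins written for the example, not excerpts from Requests.

```python
import ast


def relative_imports(source):
    """Return sibling modules imported via `from .x import y` or `from . import x`."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 1:
            if node.module:
                # from .compat import str  ->  "compat"
                names = [node.module.split(".")[0]]
            else:
                # from . import utils  ->  "utils"
                names = [alias.name for alias in node.names]
            for name in names:
                if name not in found:
                    found.append(name)
    return found


def build_graph(sources):
    """Map each module name to the sibling modules it imports."""
    return {module: relative_imports(code) for module, code in sources.items()}


def has_cycle(graph):
    """Depth-first search: a cycle exists if we reach a module still being visited."""
    visiting, done = set(), set()

    def visit(module):
        if module in done:
            return False
        if module in visiting:
            return True
        visiting.add(module)
        cyclic = any(visit(dep) for dep in graph.get(module, []))
        visiting.discard(module)
        done.add(module)
        return cyclic

    return any(visit(module) for module in graph)


# Toy sources, hypothetical and much simpler than any real package.
toy = {
    "compat": "import sys\n",
    "certs": "",
    "_internal_utils": "from .compat import str\n",
    "utils": "from . import certs\nfrom ._internal_utils import f\nfrom .compat import str\n",
}
graph = build_graph(toy)
print(graph, has_cycle(graph))
```

A static pass like this only sees imports written in the source text, so it is a starting point for exploring a package rather than a complete account of what is loaded at run time.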
We can tell from looking at this dependency graph that, first, the library does not have any cyclic dependencies; second, it leans on three external packages that are not part of the standard library (urllib3, chardet, and certifi); and last, it uses compat, utils, and _internal_utils across much of the library. This makes sense. Since Requests supports Python 2.6 and 2.7, as well as later Python 3 variants, all Python 2 and Python 3 specific code paths are contained within the compat module. Much of the module has to do with the differing string APIs in Python 2 versus 3: separate byte-string and Unicode types that mix freely in Python 2, versus Unicode text by default in Python 3.
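The idea behind a compat module can be shown in a few lines. The sketch below is a hypothetical, minimal version, assuming only the string differences matter; the names text_type, binary_type, and to_text are chosen for this example and are not taken from Requests.

```python
import sys

is_py2 = sys.version_info[0] == 2

if is_py2:
    # Python 2: `str` is a byte string and `unicode` is the text type.
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
    binary_type = str
else:
    # Python 3: `str` is text and `bytes` is the byte string type.
    text_type = str
    binary_type = bytes


def to_text(value, encoding="utf-8"):
    """Return `value` as text, decoding byte strings when necessary."""
    if isinstance(value, binary_type):
        return value.decode(encoding)
    return value


print(to_text(b"caf\xc3\xa9"), to_text("already text"))
```

The rest of a library can then import text_type and to_text from one place and never branch on the interpreter version itself, which is the benefit of concentrating the version-specific paths in a single module.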
The third-party packages each serve a single purpose in the Requests library. urllib3 carries out much of the grunt work: actually sending, waiting for, and receiving the HTTP messages, as well as connection pooling and proxy handling. chardet is used in Requests to guess the character encoding of responses that arrive without a charset declared in their Content-Type header. Lastly, the certifi package, formerly packaged within Requests itself, provides Mozilla’s collection of root certificates for SSL/TLS verification. It was removed from the Requests project so that it could be updated independently of (and hopefully more quickly than) Requests itself.
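The fallback for responses that do not say how their body is encoded can be sketched as follows. This is a simplified illustration using only the standard library: declared_charset and decode_body are hypothetical helpers, and the detect argument is a stand-in for a detector such as chardet, passed in as a plain callable so the sketch has no third-party dependency.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset named in a Content-Type header value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_body(body, content_type, detect=None):
    """Decode bytes using the declared charset, else a detector, else UTF-8."""
    charset = declared_charset(content_type)
    if charset is None and detect is not None:
        # `detect` is a hypothetical callable: bytes in, encoding name out.
        charset = detect(body)
    return body.decode(charset or "utf-8", errors="replace")


print(declared_charset("text/html; charset=ISO-8859-1"))
print(declared_charset("text/html"))
```

Only when the header is silent does guessing come into play, which is the narrow job a detection library is asked to do.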
I believe Requests is a pleasure to use because it constrains the public interface to the absolute minimum. Programmers exploring new libraries often have one or two specific needs they’re trying to fulfill, and in this case the majority of those needs can be fulfilled by the seven verb functions in Requests’s API. Many users don’t want to read documentation, and such a tiny API makes even that largely unnecessary. In the case that a user has special requirements from an HTTP library, the dependency graph expresses the order in which they might want to explore the Requests package for customization. For example, the next module the user might want to take a look at would be sessions. Through a process of gradual discovery the user can become more and more adept at using the Requests package.
In the next addition to this canon, I plan on taking a look at another package that offers a minimalist interface: the C# ORM package Dapper.