Python - HTTP - http.client

 

The http.client module in Python is a part of the Python Standard Library and provides a low-level, object-oriented interface for making HTTP requests and interacting with HTTP servers. It supports various HTTP methods such as GET, POST, PUT, DELETE, and others, as well as handling HTTP response status codes and headers.

 

The http.client module has the following main components:

 

HTTPConnection class: This class represents a single connection to an HTTP server. You can create an instance of this class by providing the server address and an optional port number. The class provides methods for opening a connection, sending requests, and receiving responses.
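
As a minimal sketch of the description above, the snippet below creates an HTTPConnection object. The host name example.com, the port 80, and the 10-second timeout are illustrative assumptions, not values from this page. Creating the object does not contact the server; the socket is opened on the first request or on an explicit connect() call.

```python
import http.client

# Host, port, and timeout are placeholder values chosen for illustration.
conn = http.client.HTTPConnection("example.com", 80, timeout=10)

# No socket has been opened yet, so these attributes can be inspected offline.
print(conn.host, conn.port)

# close() is safe to call even if the connection was never opened.
conn.close()
```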

 

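HTTPSConnection class: This class is a subclass of HTTPConnection and provides support for HTTPS (HTTP over SSL/TLS). It adds encryption and server authentication to the regular HTTP connection. When creating an instance of this class, you can pass an ssl.SSLContext object through the context parameter to control SSL/TLS settings such as client certificates, trusted CA certificates, and hostname verification. The older key_file, cert_file, and check_hostname parameters were deprecated in Python 3.6 and removed in Python 3.12.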
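
The following is a small sketch of how TLS settings might be supplied when building an HTTPSConnection. It assumes the default context from ssl.create_default_context() is what you want; the host name and the timeout are illustrative. Nothing is sent over the network until a request is made.

```python
import http.client
import ssl

# A default context verifies the server certificate and the host name.
context = ssl.create_default_context()

# Host and timeout are placeholder values chosen for illustration.
conn = http.client.HTTPSConnection("www.python.org", context=context, timeout=10)

# HTTPS connections default to port 443 when no port is given.
print(conn.port)
print(context.check_hostname, context.verify_mode == ssl.CERT_REQUIRED)

conn.close()
```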

 

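HTTPStatus enumeration: This enumeration is defined in the top-level http package (from http import HTTPStatus), and http.client exposes the same status codes as module-level constants. It contains constants for all the HTTP response status codes, such as OK (200), BAD_REQUEST (400), and INTERNAL_SERVER_ERROR (500). These constants make it easier to work with status codes in a human-readable and self-explanatory way.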
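
A short sketch of how the status constants can be used is shown below. The particular codes (404, 200, 500) are picked only for illustration.

```python
import http.client
from http import HTTPStatus

# Look up a member by its numeric code.
status = HTTPStatus(404)
print(status.name, status.value, status.phrase)  # NOT_FOUND 404 Not Found

# HTTPStatus members are integers, so they compare equal to plain ints.
print(HTTPStatus.OK == 200)  # True

# http.client.responses maps a numeric code to its reason phrase.
print(http.client.responses[500])  # Internal Server Error
```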

 

HTTPResponse class: This class represents the HTTP response returned by the server. It contains information about the status code, headers, and body of the response. You can use methods like read(), getheader(), and getheaders() to access the response data.
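
The sketch below shows read(), getheader(), and getheaders() on an HTTPResponse. To keep it runnable without Internet access, it starts a throwaway server on 127.0.0.1; the DemoHandler class and its response body are hypothetical and exist only for this illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that answers every GET with a short text body."""

    def do_GET(self):
        body = b"hello from the demo server"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request("GET", "/")
response = conn.getresponse()

status = response.status
content_type = response.getheader("Content-Type")
missing = response.getheader("X-Not-Set", "n/a")  # default for absent headers
header_names = [name for name, _ in response.getheaders()]
body = response.read().decode("utf-8")

print(status, content_type, missing)
print(header_names)
print(body)

conn.close()
server.shutdown()
server.server_close()
```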

 

Methods for sending requests: The HTTPConnection and HTTPSConnection classes provide methods to send HTTP requests. request() sends a complete request in one call, taking the HTTP method, the URL, and optionally the request body and headers; getresponse() then returns the server's reply. For finer control, putrequest(), putheader(), endheaders(), and send() build the request step by step.
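
As a rough sketch of the two styles, the snippet below sends the same POST request once with request() and once with putrequest(), putheader(), endheaders(), and send(). It talks to a local throwaway server so it can run offline; EchoHandler, the /echo path, and the JSON payload are hypothetical names used only for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler that echoes the POST body back to the client."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


server = HTTPServer(("127.0.0.1", 0), EchoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_port

payload = b'{"name": "demo"}'

# High level: request() sends the request line, headers, and body together.
# Content-Length is added automatically for a bytes body.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.request("POST", "/echo", body=payload,
             headers={"Content-Type": "application/json"})
echoed_high = conn.getresponse().read()
conn.close()

# Low level: build the same request piece by piece.
conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
conn.putrequest("POST", "/echo")
conn.putheader("Content-Type", "application/json")
conn.putheader("Content-Length", str(len(payload)))
conn.endheaders()
conn.send(payload)
echoed_low = conn.getresponse().read()
conn.close()

print(echoed_high == payload, echoed_low == payload)

server.shutdown()
server.server_close()
```

The high-level form is usually enough; the step-by-step form is mainly useful when the body has to be streamed or the headers need unusual handling.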

 

Error handling: The http.client module defines several exception classes to handle errors that might occur during the HTTP communication process. Some of these exceptions are HTTPException, RemoteDisconnected, BadStatusLine, and IncompleteRead. These exceptions can be caught and handled in your application as needed.
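
One possible way to arrange the exception handling is sketched below. fetch_status is a hypothetical helper, not part of http.client, and the labels "http-error" and "network-error" are made up for this illustration. HTTPException is caught before OSError because RemoteDisconnected derives from both BadStatusLine and ConnectionResetError.

```python
import http.client


def fetch_status(netloc, path="/", timeout=5):
    """Return ("ok", status) or (error_kind, message) for a GET request.

    Hypothetical helper for illustration; netloc is "host" or "host:port".
    """
    conn = None
    try:
        conn = http.client.HTTPConnection(netloc, timeout=timeout)
        conn.request("GET", path)
        return ("ok", conn.getresponse().status)
    except http.client.HTTPException as exc:
        # Protocol-level problems: InvalidURL, BadStatusLine, IncompleteRead, ...
        return ("http-error", f"{type(exc).__name__}: {exc}")
    except OSError as exc:
        # Network-level problems: refused connection, DNS failure, timeout, ...
        return ("network-error", f"{type(exc).__name__}: {exc}")
    finally:
        if conn is not None:
            conn.close()


# A non-numeric port is rejected with InvalidURL before any network access.
kind, message = fetch_status("localhost:notaport")
print(kind, message)
```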

 

Header parsing: The http.client module also provides the utility function parse_headers(), which reads a block of HTTP headers from a binary file-like object and returns an HTTPMessage, a dictionary-like object with case-insensitive header lookup.
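
A minimal sketch of parse_headers() is given below. The header block is a made-up sample wrapped in io.BytesIO so that the example runs without a network connection.

```python
import io
import http.client

# Sample header block (illustrative); a blank line marks the end of the headers.
raw = (
    b"Content-Type: text/html; charset=ISO-8859-1\r\n"
    b"Server: gws\r\n"
    b"Set-Cookie: a=1\r\n"
    b"Set-Cookie: b=2\r\n"
    b"\r\n"
)

headers = http.client.parse_headers(io.BytesIO(raw))

# Header names are matched case-insensitively.
print(headers["content-type"])

# The charset parameter is returned in lower case.
print(headers.get_content_charset())  # iso-8859-1

# get_all() returns every value of a repeated header.
print(headers.get_all("Set-Cookie"))  # ['a=1', 'b=2']
```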

 

 

 

Examples

 

NOTE 1 : All the examples on this page are written in Python 3.x. They may not work if you use Python 2.x.

NOTE 2 : All the examples on this page are assumed to be written/run on Windows 10 unless specifically mentioned. You MAY (or may not) need to modify the syntax a little bit if you are running on another operating system.

 

 

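  • Establishing a connection to a URL and printing the Status and Reason - Example 1
  • Establishing a connection to a URL and printing the contents of the page as HTML code - Example 2
  • Checking the status code, printing the response headers, and saving the prettified HTML to a file - Example 3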

 

< Example 01 > ========================================================

 

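import http.client

# Open an HTTPS connection and request the root page
connHttp = http.client.HTTPSConnection("www.python.org")
connHttp.request("GET", "/")

# Get the response and print its status code and reason phrase
res = connHttp.getresponse()
print(res.status, res.reason)

# Close the connection when done
connHttp.close()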

200 OK

 

 

 

< Example 02 > ========================================================

 

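import http.client

# Open an HTTPS connection and request the root page
connHttp = http.client.HTTPSConnection("www.python.org")
connHttp.request("GET", "/")

# read() returns the response body as bytes
res = connHttp.getresponse()
contents = res.read()

# Decode the bytes as UTF-8, dropping any bytes that cannot be decoded
print(contents.decode('utf-8', 'ignore'))

# Close the connection when done
connHttp.close()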

Contents of the page in html code
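
Example 2 always decodes the body as UTF-8. As a hedged alternative, the sketch below picks the charset announced in the Content-Type header and falls back to UTF-8 when none is given. decode_body is a hypothetical helper, and the sample headers are built offline with parse_headers() purely for illustration; with a live response you would pass res.read() and res.headers instead.

```python
import io
import http.client


def decode_body(body, headers, fallback="utf-8"):
    """Decode body bytes using the charset from the headers (hypothetical helper)."""
    charset = headers.get_content_charset() or fallback
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # The server announced a charset name that Python does not know.
        return body.decode(fallback, errors="replace")


# Sample headers with and without an explicit charset (illustrative values).
latin1_headers = http.client.parse_headers(
    io.BytesIO(b"Content-Type: text/html; charset=ISO-8859-1\r\n\r\n"))
plain_headers = http.client.parse_headers(
    io.BytesIO(b"Content-Type: text/html\r\n\r\n"))

text_latin1 = decode_body("caf\u00e9".encode("latin-1"), latin1_headers)
text_utf8 = decode_body("caf\u00e9".encode("utf-8"), plain_headers)
print(text_latin1 == text_utf8)
```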

 

 

 

< Example 03 > ========================================================

 

import http.client
from http import HTTPStatus

# BeautifulSoup is a third-party package (pip install beautifulsoup4)
from bs4 import BeautifulSoup

# Create an HTTPS connection to Google
conn = http.client.HTTPSConnection("www.google.com")

# Send a GET request to the root path
conn.request("GET", "/")

# Get the response
response = conn.getresponse()

# Check if the status code is OK (200)
if response.status == HTTPStatus.OK:
    # Read the response body
    data = response.read()

    # Print the response headers
    print("Headers:")
    for key, value in response.getheaders():
        print(f"{key}: {value}")

    # Parse and prettify the HTML content using BeautifulSoup
    soup = BeautifulSoup(data, 'html.parser')
    pretty_html = soup.prettify()

    # Save the prettified HTML content to a file named 'downloaded.html'
    with open('downloaded.html', 'w', encoding='utf-8') as file:
        file.write(pretty_html)

    print("Prettified HTML content has been saved to 'downloaded.html'.")
else:
    print(f"Request failed with status code: {response.status}")

# Close the connection
conn.close()

Headers:

Date: Sun, 16 Apr 2023 16:22:08 GMT

Expires: -1

Cache-Control: private, max-age=0

Content-Type: text/html; charset=ISO-8859-1

Content-Security-Policy-Report-Only: object-src 'none';base-uri 'self';script-src 'nonce-XyRmHyTVM3hQ3RQp3VlqFA' 'strict-dynamic' 'report-sample' 'unsafe-eval' 'unsafe-inline' https: http:;report-uri https://csp.withgoogle.com/csp/gws/other-hp

P3P: CP="This is not a P3P policy! See g.co/p3phelp for more info."

Server: gws

X-XSS-Protection: 0

X-Frame-Options: SAMEORIGIN

Set-Cookie: 1P_JAR=2023-04-16-16; expires=Tue, 16-May-2023 16:22:08 GMT; path=/; domain=.google.com; Secure

Set-Cookie: AEC=AUEFqZc6rRiXsbULSQo6GdbOtK1olwZ42Ay1pJtwdHDgHBd3f0OCHvfVjS0; expires=Fri, 13-Oct-2023 16:22:08 GMT; path=/; domain=.google.com; Secure; HttpOnly; SameSite=lax

Set-Cookie: NID=511=oDmFl26G1RKxsiNwaiwhqvnhjnqQcRts15nZZOsRx7ArJ-JG6vSBNYwarlCmsN1fIIfPTRt91PCiYnHeYSAyvIWDWPC2LCtXQlmAZaUnX0vtdnppGjyAV-I8i6qdkR6TIm3gFD81aUeVMPSr6MfTGO50J-wWy9Bd4Ufs-S8XGb8; expires=Mon, 16-Oct-2023 16:22:08 GMT; path=/; domain=.google.com; HttpOnly

Alt-Svc: h3=":443"; ma=2592000,h3-29=":443"; ma=2592000

Accept-Ranges: none

Vary: Accept-Encoding

Transfer-Encoding: chunked

Prettified HTML content has been saved to 'downloaded.html'.