Python Website "downloads page" returns binary data #2411

maltfield · 2024-03-16T00:19:23Z

Describe the bug

When fetching the downloads page with curl or wget, the web server returns binary data instead of HTML:

https://www.python.org/downloads

To Reproduce

Execute either of the following commands on Debian Linux:

curl --location 'https://www.python.org/downloads/'
wget 'https://www.python.org/downloads/'

Example execution:

user@disp897:/tmp/tmp.aQ3uHh4PqB$ curl --location 'https://www.python.org/downloads'
Warning: Binary output can mess up your terminal. Use "--output -" to tell 
Warning: curl to output it to your terminal anyway, or consider "--output 
Warning: <FILE>" to save to a file.
user@disp897:/tmp/tmp.aQ3uHh4PqB$

user@disp897:/tmp/tmp.aQ3uHh4PqB$ wget 'https://www.python.org/downloads/'
--2024-03-15 19:17:59--  https://www.python.org/downloads/
Resolving www.python.org (www.python.org)... 199.232.16.223, 2a04:4e42:41::223
Connecting to www.python.org (www.python.org)|199.232.16.223|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19113 (19K) [text/html]
Saving to: ‘index.html’

index.html          100%[===================>]  18.67K  --.-KB/s    in 0.05s   

2024-03-15 19:18:00 (384 KB/s) - ‘index.html’ saved [19113/19113]

user@disp897:/tmp/tmp.aQ3uHh4PqB$ 

user@disp897:/tmp/tmp.aQ3uHh4PqB$ head -c256 index.html 
�}�r�F����*�CS�5����|�,;�؎'r���M�@$a����o���������'���ƥ�$(R�@�rD��s���ލ�?[�^/m6
                                                                               �у�?���t&l��g���1vD��97���z��s�.�;v_|
                                    �ǰƯ��?m�r&������e=pۓp-�����]���J��u�߭�r��L��h�567��q�vk�r���<�^�\y����mX����:{�yӹ�Bc�O��1x�user@disp897:/tmp/tmp.aQ3uHh4PqB$

Expected behavior
The python.org web server(s) should return HTML.


maltfield · 2024-03-16T00:21:13Z

As a workaround, adding the --compressed argument to curl fetches the HTML as desired:

user@disp897:/tmp/tmp.aQ3uHh4PqB$ curl --location --compressed 'https://www.python.org/downloads/' | head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:00:01 --:--:--     0<!doctype html>
<!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->
<!--[if IE 7]>      <html class="no-js ie7 lt-ie8 lt-ie9">          <![endif]-->
<!--[if IE 8]>      <html class="no-js ie8 lt-ie9">                 <![endif]-->
<!--[if gt IE 8]><!--><html class="no-js" lang="en" dir="ltr">  <!--<![endif]-->

<head>
    <!-- Google tag (gtag.js) -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-TF35YF9CVH"></script>
    <script>
 41 19113   41  8007    0     0   3748      0  0:00:05  0:00:02  0:00:03  3748
curl: (23) Failure writing output to destination
user@disp897:/tmp/tmp.aQ3uHh4PqB$

Setting --compression=gzip in wget also works as a workaround:

user@disp897:/tmp/tmp.aQ3uHh4PqB$ wget --compression=gzip 'https://www.python.org/downloads/'
--2024-03-15 19:22:27--  https://www.python.org/downloads/
Resolving www.python.org (www.python.org)... 199.232.16.223, 2a04:4e42:41::223
Connecting to www.python.org (www.python.org)|199.232.16.223|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 19113 (19K) [text/html]
Saving to: ‘index.html’

index.html          100%[===================>]  18.67K  80.3KB/s    in 0.2s    

2024-03-15 19:22:29 (80.3 KB/s) - ‘index.html’ saved [174854]

user@disp897:/tmp/tmp.aQ3uHh4PqB$ 

user@disp897:/tmp/tmp.aQ3uHh4PqB$ head index.html 
<!doctype html>
<!--[if lt IE 7]>   <html class="no-js ie6 lt-ie7 lt-ie8 lt-ie9">   <![endif]-->
<!--[if IE 7]>      <html class="no-js ie7 lt-ie8 lt-ie9">          <![endif]-->
<!--[if IE 8]>      <html class="no-js ie8 lt-ie9">                 <![endif]-->
<!--[if gt IE 8]><!--><html class="no-js" lang="en" dir="ltr">  <!--<![endif]-->

<head>
    <!-- Google tag (gtag.js) -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=G-TF35YF9CVH"></script>
    <script>
user@disp897:/tmp/tmp.aQ3uHh4PqB$
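
For scripts that fetch the page without curl or wget, the same workaround can be sketched in Python. This is a minimal sketch, not a confirmed fix: it assumes the binary body is gzip data (which the working --compressed and --compression=gzip flags suggest, but the thread does not confirm), and the names decode_body and fetch_html are hypothetical helpers made up for illustration. Python's urllib.request does not decompress responses on its own, so the body is checked for the gzip magic bytes and decompressed by hand.

```python
import gzip
import urllib.request

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def decode_body(body: bytes) -> bytes:
    """Return the gunzipped body if it looks like gzip, else the body unchanged."""
    if body.startswith(GZIP_MAGIC):
        return gzip.decompress(body)
    return body


def fetch_html(url: str) -> str:
    """Fetch url and return its text, handling a gzip-compressed response.

    Hypothetical helper: assumes the page is UTF-8 once decompressed.
    """
    with urllib.request.urlopen(url) as response:
        raw = response.read()
    return decode_body(raw).decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Performs a real network request; prints the start of the page.
    print(fetch_html("https://www.python.org/downloads/")[:200])
```

Checking the magic bytes rather than trusting the Content-Encoding header keeps the helper working whether or not the server labels the compressed response correctly.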

hugovk · 2024-09-05T18:10:15Z

Are you trying to scrape the page? What info exactly are you after? Perhaps there's a better place to fetch that from.

For example, you can also find downloads at https://www.python.org/ftp/python/

maltfield · 2024-09-06T22:06:15Z

I was programmatically downloading the GPG keys listed on that page for 3TOFU, yes.

Why shouldn't this bug be fixed?

hugovk · 2024-09-07T07:45:40Z

I'm not saying it shouldn't be fixed, but I can't say when or if that will happen.

Anyway, I think that is the only source for GPG keys, so it's good you have a workaround for now.

maltfield mentioned this issue Mar 16, 2024

Fix Builds (python_gnupg-0.5.2-py2.py3-none-any.whl.asc 404 not found) BusKill/buskill-app#78

Closed

JacobCoffee added the bug and help-wanted labels · Sep 13, 2024
